Abstract. We study pure-strategy Nash equilibria in multi-player concurrent deterministic
games, for a variety of preference relations. We provide a novel construction, called
the suspect game, which transforms a multi-player concurrent game into a two-player
turn-based game which turns Nash equilibria into winning strategies (for some objective that
depends on the preference relations of the players in the original game). We use that
transformation to design algorithms for computing Nash equilibria in finite games, which
in most cases have optimal worst-case complexity, for large classes of preference relations.
This includes the purely qualitative framework, where each player has a single ω-regular
objective that she wants to satisfy, but also the larger class of semi-quantitative objectives,
where each player has several ω-regular objectives equipped with a preorder (for instance,
a player may want to satisfy all her objectives, or to maximise the number of objectives
that she achieves).


1. Introduction 

Games (and especially games played on graphs) have been intensively used in computer
science as a powerful way of modelling interactions between several computerised
systems [39, 24]. Until recently, more focus had been put on the study of purely antagonistic
games (a.k.a. zero-sum games), which conveniently represent systems evolving in a (hostile)
environment. In this zero-sum setting, the objectives of both players are opposite:
the aim of one player is to prevent the other player from achieving her own objective.

Over the last ten years, games with non-zero-sum objectives have come into the picture: 
they allow for conveniently modelling complex infrastructures where each individual system 
tries to fulfil its own objectives, while still being subject to uncontrollable actions of the 
surrounding systems. As an example, consider a wireless network in which several devices 
try to send data: each device can modulate its transmit power, in order to maximise its 
bandwidth and reduce energy consumption as much as possible. In that setting, focusing 
only on optimal strategies for one single agent may be too narrow. Game-theoreticians 
have defined and studied many other solution concepts for such settings, of which Nash 
equilibrium [35] is the most prominent. A Nash equilibrium is a strategy profile where no 
player can improve the outcome of the game by unilaterally changing her strategy. In other 
terms, in a Nash equilibrium, each individual player has a satisfactory strategy. Notice 
that Nash equilibria need not exist or be unique, and are not necessarily optimal: Nash 
equilibria where all players lose may coexist with more interesting Nash equilibria. Finding 
constrained Nash equilibria (e.g., equilibria in which some players are required to win) is 
thus an interesting problem for our setting. 

In this paper, we report on our recent contributions on the computation of Nash
equilibria in concurrent games (preliminary works appeared as [??]). Concurrent games
played on graphs are a general model for interactive systems, where the agents take their
decisions simultaneously. Concurrent games therefore subsume turn-based games, where in
each state, only one player decides the next move. One motivation for concurrent
games is the study of timed games (which are games played on timed automata [??]):
the semantics of a timed game is naturally given as a concurrent game (the players all choose
simultaneously a delay and an action to play, and the player with the shortest delay decides
for the next move: this mechanism cannot be made turn-based since we cannot fix a priori
the player who will choose the smallest delay); the region-based game abstraction which
preserves Nash equilibria also requires the formalism of concurrent games [??]. Multi-agent
infrastructures can be viewed as distributed systems, which can naturally be modelled as
concurrent games.

Our contributions. The paper focuses on concurrent deterministic games and on pure 
Nash equilibria, that is, strategy profiles which are deterministic (as opposed to randomised). 
In this work we assume strategies only depend on the sequence of states which is visited, and not
on the actions that have been played. This is a partial-information hypothesis which we 
believe is relevant in the context of distributed systems, where only the effect of the actions 
can be seen by the players. We will discuss in more detail all these choices in the conclusion. 

In the context exposed above, we develop a complete methodology for computing pure
Nash equilibria in (finite) games. First, in Section ??, we propose a novel transformation of
the multi-player concurrent game (with a preference relation for each player) into a
two-player zero-sum turn-based game, which we call the suspect game. Intuitively, in the suspect
game, one of the players suggests a global move (one action per player of the original game),
with the aim to progressively build a Nash equilibrium, while the second player aims at
proving that what the first player proposes is not a Nash equilibrium. This transformation
can be applied to arbitrary concurrent games (even those with infinitely many states) and
preference relations for the players, and it has the property that there is a correspondence
between Nash equilibria in the original game and winning strategies in the transformed
two-player turn-based game. The winning condition in the suspect game of course depends
on the preference relations of the various players in the original game.

Objective             | Value            | (Constrained) Existence of Nash Eq.
Reachability          | P-c. [32]        | NP-c. (Sect. ??)
Safety                | P-c. [32]        | NP-c. (Sect. 5.2)
Büchi                 | P-c. [32]        | P-c. (Sect. ??)
co-Büchi              | P-c. [32]        | NP-c. (Sect. ??)
Parity                | UP ∩ co-UP [28]  | $P^{NP}_{\parallel}$-c.^1 (Sect. ??)
Streett               | co-NP-c. [??]    | $P^{NP}_{\parallel}$-h. and in PSPACE
Rabin                 | NP-c. [??]       | $P^{NP}_{\parallel}$-c. (Sect. ??)
Muller                | PSPACE-c. [27]   | PSPACE-c.
Circuit               | PSPACE-c. [27]   | PSPACE-c. (Sect. ??)
Det. Büchi Automata   | P-c.             | PSPACE-h. (Sect. ??) and in EXPTIME
Det. Rabin Automata   | NP-c.            | PSPACE-h. and in EXPTIME (Sect. ??)

Table 1. Summary of the complexities for single objectives


Preorder            | Value                 | Existence of NE         | Constr. Exist. of NE
Maximise, Disj.     | P-c. (Sect. 6.2)      | P-c. (Sect. 6.2)        | P-c. (Sect. 6.2)
Subset              | P-c. (Sect. 6.3)      | P-c. (Sect. 6.2)        | P-c. (Sect. ??)
Conj., Lexicogr.    | P-c. (Sect. 6.3)      | P-h., in NP (Sect. ??)  | NP-c. (Sect. ??)
Counting            | coNP-c. (Sect. 6.3)   | NP-c. (Sect. ??)        | NP-c. (Sect. ??)
Mon. Bool. Circuit  | coNP-c. (Sect. ??)    | NP-c. (Sect. ??)        | NP-c. (Sect. ??)
Boolean Circuit     | PSPACE-c. (Sect. ??)  | PSPACE-c. (Sect. ??)    | PSPACE-c. (Sect. ??)

Table 2. Summary of the results for ordered Büchi objectives


Preorder                          | Value                 | (Constrained) Exist. of NE
Disjunction, Maximise             | P-c. (Sect. 7.2)      | NP-c. (Sect. ??)
Subset                            | PSPACE-c. (Sect. ??)  | NP-c. (Sect. ??)
Conjunction, Counting, Lexicogr.  | PSPACE-c. (Sect. ??)  | PSPACE-c. (Sect. ??)
(Monotonic) Boolean Circuit       | PSPACE-c. (Sect. ??)  | PSPACE-c. (Sect. ??)

Table 3. Summary of the results for ordered reachability objectives

Then, using that construction, we develop (worst-case) optimal-complexity algorithms
for deciding the existence of (constrained) Nash equilibria in finite games for various classes
of preference relations. In Section ??, we focus on qualitative ω-regular objectives, i.e.,
preference relations are given by single objectives (which can be reachability, Büchi, parity,
etc.), and it is better for a player to satisfy her objective than to not satisfy her objective.
We prove the whole set of results which are summarised in the second column of Table 1 (the
first column summarises the complexity in the zero-sum two-player setting, called the value
problem). Among the results obtained this way, the constrained Nash equilibrium existence
problem is NP-complete in finite games with single reachability or safety objectives, while
it is PTIME-complete for single Büchi objectives.

Figure 1. A simple game-model for the wireless network

In Sections 6 and 7, we extend the previous qualitative setting to the semi-quantitative
setting of ordered objectives. An ordered objective is a set of Büchi (or reachability)
objectives and a preorder on this set. The preference relation given by such an ordered objective
is then given by the value of the plays (w.r.t. the objectives) in that preorder. Preorders
of interest are for instance conjunction, disjunction, lexicographic order, counting preorder,
maximise preorder, subset preorder, or more generally preorders given as Boolean circuits.
We provide algorithms for deciding the existence of Nash equilibria for ordered objectives,
with (in most cases) optimal worst-case complexity. These algorithms make use of the
suspect-game construction. The results are listed in Table 2 for Büchi objectives and in
Table 3 for reachability objectives.

Examples. Back to the earlier wireless network example, we can model a simple discretised
version of it as follows. From a state, each device can increase (action 1) or keep unchanged
(action 0) its power: the arena of the game is represented for two devices and two levels of
power in Figure 1 (labels of states are power levels). This yields a new bandwidth allocation
(which depends on the degradation due to the other devices) and a new energy consumption.
The satisfaction of each device is measured as a compromise between energy consumption
and bandwidth allocated, and it is given by a quantitative payoff function.^2 This can be
transformed into Büchi conditions and a preorder on them. There are basically two families
of pure Nash equilibria in this system: the one where the two players choose to go to state
(1, 1) and stay there forever; and the one where the two players go to state (2, 2) and stay
there forever.

We describe another example, the medium access control, that involves qualitative
objectives. It was first given a game-theoretic model in [??]. Several users share the access
to a wireless channel. During each slot, they can choose to either transmit or wait for the
next slot. If too many users are emitting in the same slot, then they fail to send data.
Each attempt to transmit costs energy to the players. They have to maximise the number
of successful attempts using the energy available to them. We give in Figure 2 a possible
model for that protocol for two players and at most one attempt per player and a congestion
of 2 (that is, the two players should not transmit at the same time): each state is labelled
with the energy level of the two players, and the number of successful attempts of each
of the players. There are several Nash equilibria, and they all give payoff 1 to every player;
they consist in going to state (0, 1, 0, 1) by not transmitting simultaneously.

Figure 2. A simple game-model for the medium access control

^1 The complexity class $P^{NP}_{\parallel}$ is defined in terms of Turing machines having access to an oracle; oracles are
artificial devices that can solve a problem in constant time, thus hiding part of the complexity of the overall
problem. The class $P^{NP}$ is the class of problems that can be solved in polynomial time by a deterministic
Turing machine which has access to an oracle for solving NP problems. The class $P^{NP}_{\parallel}$ is the subclass where,
instead of asking a sequence of (dependent) queries to the oracle, the Turing machine is only allowed to ask
one set of queries. We refer to [36, 44] for more details.

^2 The (quantitative) payoff for player i is expressed in terms of $\gamma_i$, R and L, where $\gamma_i$ is the
signal-to-interference-and-noise ratio for player i, R is the rate at which the wireless system transmits the
information in bits per second and L is the size of the packets in bits ([37]).

Related work. Game theory has been a very active area since the 1940s, with the
pioneering works of von Neumann and Morgenstern [43], Nash [35] and Shapley [38]. It has had
numerous uses in various domains, ranging from economics to human sciences and logic.
Equilibria are a central concept in (non-zero-sum) games, as they are meant to represent
rational behaviours of the players. Many important results about existence of various kinds
of equilibria in different kinds of games have been established [13, 35, 20].

For applications in logic and computer science, games played on graphs have received
more focus; also, computer scientists have been mostly looking for algorithmic solutions for
deciding the existence and effectively computing equilibria and ε-equilibria [??].

For two-player concurrent games with Büchi objectives, the existence of ε-equilibria
(in randomised strategies) was proved by Chatterjee [10]. However, exact Nash equilibria
need not exist; turn-based games with Büchi objectives are an important subclass where
Nash equilibria (even in pure strategies) always exist [15]. When they exist, Nash equilibria
need not be unique; equilibria where all the players lose can coexist with equilibria where
some (or all) of them win. Ummels introduced constrained Nash equilibria, i.e., Nash
equilibria where some players are required to win. In particular, he showed that the existence of
constrained Nash equilibria can be decided in polynomial time for turn-based games with
Büchi objectives [??]. In this paper, we extend this result to concurrent games, and to
various classes of ω-regular winning objectives. For concurrent games with ω-regular
objectives, the decidability of the constrained Nash equilibrium existence problem w.r.t. pure
strategies was established by Fisman et al. [??], but their algorithm runs in doubly
exponential time, whereas our algorithm runs in exponential time for objectives given as Büchi
automata. Finally, Ummels and Wojtczak [42] proved that the existence of a Nash
equilibrium in pure or randomised strategies is undecidable for stochastic games with reachability
or Büchi objectives, which justifies our restriction to concurrent games without probabilistic
transitions. They also proved a similar undecidability result for randomised Nash equilibria
in non-stochastic games [??], hence we consider only pure-strategy Nash equilibria.

Several solution concepts have been defined and studied for games on graphs. In
particular, secure equilibria [??] are Nash equilibria where besides satisfying their primary
objectives, the players try to prevent the other players from achieving their own (primary)
objectives. Notice that our results in Sect. 6.4 and Sect. 7.1 do apply to such
lexicographic combinations of several objectives.

Temporal logics can also be used to express properties of games. While ATL [2]
can mainly express only zero-sum properties, other logics such as ATL with strategy
contexts (ATLsc) [??] and Strategy Logic (SL) [??, 33] can be used to express rich properties
in a non-zero-sum setting. Model checking for such logics, however,
has high complexity: Nash equilibria can be expressed using one quantifier alternation (an
existential quantification over strategy profiles followed by a universal quantification over
deviations); model checking this fragment of ATLsc or SL is 2-EXPTIME-complete.

2. Definitions 

2.1. General definitions. In this section, we fix some definitions and notations. 

Preorders. We fix a non-empty set $P$. A preorder over $P$ is a binary relation $\lesssim\ \subseteq P \times P$
that is reflexive and transitive. With a preorder $\lesssim$, we associate an equivalence relation $\sim$
defined so that $a \sim b$ if, and only if, $a \lesssim b$ and $b \lesssim a$. The equivalence class of $a$, written $[a]_{\lesssim}$,
is the set $\{b \in P \mid a \sim b\}$. We also associate with $\lesssim$ a strict partial order $\prec$ defined so that
$a \prec b$ if, and only if, $a \lesssim b$ and $b \not\lesssim a$. A preorder $\lesssim$ is said total if, for all elements $a, b \in P$,
either $a \lesssim b$, or $b \lesssim a$. An element $a$ in a subset $P' \subseteq P$ is said maximal in $P'$ if there is
no $b \in P'$ such that $a \prec b$; it is said minimal in $P'$ if there is no $b \in P'$ such that $b \prec a$.
A preorder is said Noetherian (or upwards well-founded) if any subset $P' \subseteq P$ has at least
one maximal element. It is said almost-well-founded if any lower-bounded subset $P' \subseteq P$
has a minimal element.

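Transition systems. A transition system is a pair $\mathcal{S} = (\mathsf{States}, \mathsf{Edg})$ where $\mathsf{States}$ is a set
of states and $\mathsf{Edg} \subseteq \mathsf{States} \times \mathsf{States}$ is the set of transitions. A path $\pi$ in $\mathcal{S}$ is a sequence
$(s_i)_{0 \le i < n}$ (where $n \in \mathbb{N}^{+} \cup \{\infty\}$) of states such that $(s_i, s_{i+1}) \in \mathsf{Edg}$ for all $i < n - 1$. The length
of $\pi$, denoted by $|\pi|$, is $n - 1$. The set of finite paths (also called histories) of $\mathcal{S}$ is denoted
by $\mathsf{Hist}_{\mathcal{S}}$, the set of infinite paths (also called plays) of $\mathcal{S}$ is denoted by $\mathsf{Play}_{\mathcal{S}}$, and
$\mathsf{Path}_{\mathcal{S}} = \mathsf{Hist}_{\mathcal{S}} \cup \mathsf{Play}_{\mathcal{S}}$ is the set of all paths of $\mathcal{S}$. Given a path $\pi = (s_i)_{0 \le i < n}$ and an integer $j < n$,
the j-th prefix (resp. j-th suffix, j-th state) of $\pi$, denoted by $\pi_{\le j}$ (resp. $\pi_{\ge j}$, $\pi_{=j}$), is the finite
path $(s_i)_{0 \le i < j+1}$ (resp. the path $(s_{j+i})_{0 \le i < n-j}$, the state $s_j$). If $\pi = (s_i)_{0 \le i < n}$ is a history,
we write $\mathsf{last}(\pi) = s_{|\pi|}$ for the last state of $\pi$. If $\pi'$ is a path such that $(\mathsf{last}(\pi), \pi'_{=0}) \in \mathsf{Edg}$,
then the concatenation $\pi \cdot \pi'$ is the path $\rho$ s.t. $\rho_{=i} = \pi_{=i}$ for $i \le |\pi|$ and $\rho_{=i} = \pi'_{=(i-1-|\pi|)}$
for $i > |\pi|$. In the sequel, we write $\mathsf{Hist}_{\mathcal{S}}(s)$, $\mathsf{Play}_{\mathcal{S}}(s)$ and $\mathsf{Path}_{\mathcal{S}}(s)$ for the respective
subsets of paths starting in state $s$. If $\pi$ is a play, $\mathsf{Occ}(\pi) = \{s \mid \exists j.\ \pi_{=j} = s\}$ is the set of
states that appear at least once along $\pi$ and $\mathsf{Inf}(\pi) = \{s \mid \forall i.\ \exists j \ge i.\ \pi_{=j} = s\}$ is the set
of states that appear infinitely often along $\pi$.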
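As an illustration of Occ and Inf, the following minimal Python sketch handles plays of lasso shape, i.e. a finite prefix followed by a cycle repeated forever. The lasso representation and the function names `occ`, `inf` and `state_at` are assumptions made for this example, not notation from the paper.

```python
from typing import Hashable, Sequence, Set


def state_at(prefix: Sequence[Hashable], cycle: Sequence[Hashable], j: int) -> Hashable:
    """Return the j-th state of the play prefix . cycle^omega."""
    if not cycle:
        raise ValueError("a play is infinite, so the cycle must be non-empty")
    if j < len(prefix):
        return prefix[j]
    return cycle[(j - len(prefix)) % len(cycle)]


def occ(prefix: Sequence[Hashable], cycle: Sequence[Hashable]) -> Set[Hashable]:
    """States visited at least once along prefix . cycle^omega."""
    if not cycle:
        raise ValueError("a play is infinite, so the cycle must be non-empty")
    return set(prefix) | set(cycle)


def inf(prefix: Sequence[Hashable], cycle: Sequence[Hashable]) -> Set[Hashable]:
    """States visited infinitely often: exactly the states on the cycle."""
    if not cycle:
        raise ValueError("a play is infinite, so the cycle must be non-empty")
    return set(cycle)


# Hypothetical play s0 s1 (s2 s3)^omega.
PREFIX = ["s0", "s1"]
CYCLE = ["s2", "s3"]
```

Only plays that are eventually periodic can be represented this way; arbitrary plays over an infinite state space are outside the scope of the sketch.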
2.2. Concurrent games. Our definition of concurrent
games extends the definition in [2] by allowing for more than
two players, each of them having a preorder over plays.

Definition 2.1. A concurrent game is a tuple
$\mathcal{G} = (\mathsf{States}, \mathsf{Agt}, \mathsf{Act}, \mathsf{Mov}, \mathsf{Tab}, (\lesssim_A)_{A \in \mathsf{Agt}})$, where $\mathsf{States}$ is a
finite non-empty set of states, $\mathsf{Agt}$ is a finite set of players,
$\mathsf{Act}$ is a finite set of actions, and

• $\mathsf{Mov}\colon \mathsf{States} \times \mathsf{Agt} \to 2^{\mathsf{Act}} \setminus \{\emptyset\}$ is a mapping indicating
the actions available to a given player in a given state;

• $\mathsf{Tab}\colon \mathsf{States} \times \mathsf{Act}^{\mathsf{Agt}} \to \mathsf{States}$ associates, with a given
state and a given move of the players (i.e., an element of
$\mathsf{Act}^{\mathsf{Agt}}$), the state resulting from that move;

• for each $A \in \mathsf{Agt}$, $\lesssim_A$ is a preorder over $\mathsf{States}^{\omega}$, called
the preference relation of player $A$.

Figure 3 displays an example of a finite concurrent game. Transitions are labelled with the
moves that trigger them. We say that a move $m_{\mathsf{Agt}} = (m_A)_{A \in \mathsf{Agt}} \in \mathsf{Act}^{\mathsf{Agt}}$ is legal at $s$ if
$m_A \in \mathsf{Mov}(s, A)$ for all $A \in \mathsf{Agt}$. A game is turn-based if for each state the set of allowed
moves is a singleton for all but at most one player.

In a concurrent game $\mathcal{G}$, whenever we arrive at a state $s$, the players simultaneously
select an available action, which results in a legal move $m_{\mathsf{Agt}}$; the next state of the game is
then $\mathsf{Tab}(s, m_{\mathsf{Agt}})$. The same process repeats ad infinitum to form an infinite sequence of
states.

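In the sequel, as no ambiguity will arise, we may abusively write $\mathcal{G}$ for its underlying
transition system $(\mathsf{States}, \mathsf{Edg})$ where $\mathsf{Edg} = \{(s, s') \in \mathsf{States} \times \mathsf{States} \mid \exists m_{\mathsf{Agt}} \in
\prod_{A \in \mathsf{Agt}} \mathsf{Mov}(s, A) \text{ s.t. } \mathsf{Tab}(s, m_{\mathsf{Agt}}) = s'\}$. The notions of paths and related concepts in
concurrent games follow from this identification.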
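To make the definitions of legal moves, turn-based games and the underlying transition system concrete, here is a minimal Python sketch of the arena part of a concurrent game (preference relations are left out). The class name `ConcurrentGame`, the dictionary encoding of Mov and Tab, and the two-state example are hypothetical; the example is not the game of Figure 3.

```python
from itertools import product
from typing import Dict, List, Set, Tuple

State = str
Agent = str
Action = int
Move = Tuple[Action, ...]  # one action per agent, in the order of `agents`


class ConcurrentGame:
    """Arena of a concurrent game: states, agents, Mov and Tab."""

    def __init__(self, states: List[State], agents: List[Agent],
                 mov: Dict[Tuple[State, Agent], Set[Action]],
                 tab: Dict[Tuple[State, Move], State]) -> None:
        self.states = states
        self.agents = agents
        self.mov = mov  # mov[(s, A)] is the non-empty set Mov(s, A)
        self.tab = tab  # tab[(s, m)] is Tab(s, m) for every legal move m

    def legal_moves(self, s: State) -> List[Move]:
        """All moves m with m_A in Mov(s, A) for every agent A."""
        return list(product(*(sorted(self.mov[(s, a)]) for a in self.agents)))

    def edges(self) -> Set[Tuple[State, State]]:
        """Edg: pairs (s, s') such that some legal move leads from s to s'."""
        return {(s, self.tab[(s, m)])
                for s in self.states for m in self.legal_moves(s)}

    def is_turn_based(self) -> bool:
        """At most one agent has a real choice in each state."""
        return all(sum(len(self.mov[(s, a)]) > 1 for a in self.agents) <= 1
                   for s in self.states)


# Hypothetical game: in s0 both agents pick 0 or 1; equal picks stay in s0,
# different picks lead to the sink s1.
game = ConcurrentGame(
    states=["s0", "s1"],
    agents=["A", "B"],
    mov={("s0", "A"): {0, 1}, ("s0", "B"): {0, 1},
         ("s1", "A"): {0}, ("s1", "B"): {0}},
    tab={("s0", (0, 0)): "s0", ("s0", (1, 1)): "s0",
         ("s0", (0, 1)): "s1", ("s0", (1, 0)): "s1",
         ("s1", (0, 0)): "s1"},
)
```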

Remark 2.2 (Representation of finite games). In this paper, for finite games, we will
assume an explicit encoding of the transition function $\mathsf{Tab}$. Hence, its size, denoted $|\mathsf{Tab}|$,
is equal to $\sum_{s \in \mathsf{States}} \prod_{A \in \mathsf{Agt}} |\mathsf{Mov}(s, A)| \cdot \lceil \log(|\mathsf{States}|) \rceil$. Note that it can be exponential
with respect to the number of players. A symbolic encoding of the transition table has been
proposed in [30], in the setting of ATL model checking. This makes the problem harder,
as the input is more succinct (see Remark 5.1 and Proposition 5.2 for a formal statement).
We would also have a blowup in our setting, and prefer to keep the explicit representation
in order to be able to compare with existing results. Notice that, as a matter of fact, there
is no way to systematically avoid an explosion: as there are $|\mathsf{States}|^{|\mathsf{States}| \cdot |\mathsf{Act}|^{|\mathsf{Agt}|}}$ possible
transition functions, for any encoding there is one function whose encoding will have size
at least $\lceil \log(|\mathsf{States}|) \rceil \cdot |\mathsf{States}| \cdot |\mathsf{Act}|^{|\mathsf{Agt}|}$. The total size of the game is then

$|\mathcal{G}| = |\mathsf{States}| + |\mathsf{States}| \cdot |\mathsf{Agt}| \cdot |\mathsf{Act}| + \sum_{s \in \mathsf{States}} \prod_{A \in \mathsf{Agt}} |\mathsf{Mov}(s, A)| \cdot \lceil \log(|\mathsf{States}|) \rceil + \sum_{A \in \mathsf{Agt}} |\lesssim_A|.$

The size of a preference relation $\lesssim_A$ will depend on how it is encoded, and we will make it
precise when it is relevant. This is given in Section 2.5.
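The size formula for the explicit transition table can be checked on small instances with the following Python sketch. It assumes base-2 logarithms (the remark only writes log), and the helper names `ceil_log2`, `tab_size` and `worst_case_tab_bits` are hypothetical.

```python
from math import prod
from typing import Dict, Hashable, Sequence, Set, Tuple


def ceil_log2(n: int) -> int:
    """Smallest k with 2**k >= n, for n >= 1 (bits needed to name one of n states)."""
    return (n - 1).bit_length()


def tab_size(states: Sequence[Hashable], agents: Sequence[Hashable],
             mov: Dict[Tuple[Hashable, Hashable], Set[int]]) -> int:
    """|Tab| = sum over states of (product over agents of |Mov(s, A)|) * ceil(log |States|)."""
    bits = ceil_log2(len(states))
    return sum(prod(len(mov[(s, a)]) for a in agents) * bits for s in states)


def worst_case_tab_bits(n_states: int, n_actions: int, n_agents: int) -> int:
    """Lower bound ceil(log |States|) * |States| * |Act|^|Agt| from the counting argument."""
    return ceil_log2(n_states) * n_states * n_actions ** n_agents


# Hypothetical instance: 4 states, 3 agents, every agent has 2 actions everywhere.
STATES = ["s0", "s1", "s2", "s3"]
AGENTS = ["A", "B", "C"]
MOV = {(s, a): {0, 1} for s in STATES for a in AGENTS}
```

On this instance every state contributes 2^3 table entries of 2 bits each, so the explicit table already meets the worst-case bound; adding a player doubles it, which is the exponential dependence on the number of players noted in the remark.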

Definition 2.3. Let $\mathcal{G}$ be a concurrent game, and $A \in \mathsf{Agt}$. A strategy for $A$ is a mapping
$\sigma_A\colon \mathsf{Hist}_{\mathcal{G}} \to \mathsf{Act}$ such that $\sigma_A(\pi) \in \mathsf{Mov}(\mathsf{last}(\pi), A)$ for all $\pi \in \mathsf{Hist}_{\mathcal{G}}$. A strategy $\sigma_P$ for a
coalition $P \subseteq \mathsf{Agt}$ is a tuple of strategies, one for each player in $P$. We write $\sigma_P = (\sigma_A)_{A \in P}$
for such a strategy. A strategy profile is a strategy for $\mathsf{Agt}$. We write $\mathsf{Strat}^{P}_{\mathcal{G}}$ for the set of
strategies of coalition $P$, and $\mathsf{Prof}_{\mathcal{G}} = \mathsf{Strat}^{\mathsf{Agt}}_{\mathcal{G}}$.

Figure 3. Representation of a two-player concurrent game

Note that, in this paper, we only consider pure (i.e., non-randomised) strategies. This is 
actually crucial in all the constructions we give (lasso representation in Subsection 3.1 and
suspect-game construction in Section ??). Notice also that our strategies are based on the
sequences of visited states (they map sequences of states to actions), which is realistic when 
considering multi-agent systems. In some settings, it is more usual to base strategies on 
the sequences of actions played by all the players. When dealing with Nash equilibria, this 
makes a big difference: strategies based on actions can immediately detect which player(s) 
deviated from their strategy; strategies based on states will only detect deviations because an 
unexpected state is visited, without knowing which player(s) is responsible for the deviation. 
Our construction precisely amounts to keeping track of a list of suspects for some deviation. 

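Let $\mathcal{G}$ be a game, $P$ a coalition, and $\sigma_P$ a strategy for $P$. A path $\pi$ is compatible with
the strategy $\sigma_P$ if, for all $k < |\pi|$, there exists a move $m_{\mathsf{Agt}}$ such that

(1) $m_{\mathsf{Agt}}$ is legal at $\pi_{=k}$,

(2) $m_A = \sigma_A(\pi_{\le k})$ for all $A \in P$, and

(3) $\mathsf{Tab}(\pi_{=k}, m_{\mathsf{Agt}}) = \pi_{=k+1}$.

We write $\mathsf{Out}_{\mathcal{G}}(\sigma_P)$ for the set of paths (called the outcomes) in $\mathcal{G}$ that are compatible with
strategy $\sigma_P$ of $P$. We write $\mathsf{Out}^{f}_{\mathcal{G}}$ (resp. $\mathsf{Out}^{\infty}_{\mathcal{G}}$) for the finite (resp. infinite) outcomes, and
$\mathsf{Out}_{\mathcal{G}}(s, \sigma_P)$, $\mathsf{Out}^{f}_{\mathcal{G}}(s, \sigma_P)$ and $\mathsf{Out}^{\infty}_{\mathcal{G}}(s, \sigma_P)$ for the respective sets of outcomes of $\sigma_P$ with
initial state $s$. Notice that any strategy profile has a single infinite outcome from a given
state. In the sequel, when given a strategy profile $\sigma_{\mathsf{Agt}}$, we identify $\mathsf{Out}(s, \sigma_{\mathsf{Agt}})$ with the
unique play it contains.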

A concurrent game involving only two players ($A$ and $B$, say) is zero-sum if, for any
two plays $\pi$ and $\pi'$, it holds $\pi \lesssim_A \pi'$ if, and only if, $\pi' \lesssim_B \pi$. Such a setting is purely
antagonistic, as both players have opposite objectives. The most relevant concept in such a
setting is that of winning strategies, where the aim is for one player to achieve her objectives
whatever the other players do. In non-zero-sum games, winning strategies are usually too
restricted, and the most relevant concepts are equilibria, which correspond to strategies
that satisfy (which can be given several meanings) all the players. One of the most studied
notions of equilibria is Nash equilibria [35], which we now introduce.

2.3. Nash equilibria. We begin with introducing some vocabulary. When $\pi \lesssim_A \pi'$ we
say that $\pi'$ is at least as good as $\pi$ for $A$. We say that a strategy $\sigma_A$ for $A$ ensures $\pi$ if
every outcome of $\sigma_A$ is at least as good as $\pi$ for $A$, and that $A$ can ensure $\pi$ when such a
strategy exists.

Given a move $m_{\mathsf{Agt}}$ and an action $m'$ for some player $A$, we write $m_{\mathsf{Agt}}[A \mapsto m']$ for the
move $n_{\mathsf{Agt}}$ with $n_B = m_B$ when $B \neq A$ and $n_A = m'$. This is extended to strategies in the
natural way.

Definition 2.4. Let $\mathcal{G}$ be a concurrent game and let $s$ be a state of $\mathcal{G}$. A Nash equilibrium
of $\mathcal{G}$ from $s$ is a strategy profile $\sigma_{\mathsf{Agt}} \in \mathsf{Prof}_{\mathcal{G}}$ such that $\mathsf{Out}(s, \sigma_{\mathsf{Agt}}[A \mapsto \sigma']) \lesssim_A \mathsf{Out}(s, \sigma_{\mathsf{Agt}})$
for all players $A \in \mathsf{Agt}$ and all strategies $\sigma' \in \mathsf{Strat}^{A}$.
So, Nash equilibria are strategy profiles where no single player has an incentive to
unilaterally deviate from her strategy.

Remark 2.5. Our definition of a Nash equilibrium requires any deviation to be worse than or
equivalent to the equilibrium. Another possible definition would have been to
ask any deviation to be no better than the equilibrium.
Those two definitions yield different notions of Nash equilibria (unless the preorders
are total), as illustrated in Figure 4: the black node $n$ represents $\mathsf{Out}(s, \sigma_{\mathsf{Agt}})$, the
light-gray area contains the nodes $n'$ such that $n' \lesssim_A n$,
while the dark-gray area contains the nodes $n'$ for which $n \not\prec_A n'$.

This alternative definition would also be meaningful,
and the techniques we develop in this paper could be
adapted to handle such a variant.
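The difference between the two notions only shows up when some deviation is incomparable with the equilibrium outcome. The following Python sketch illustrates this on abstract payoff vectors ordered by inclusion of the satisfied objectives; the use of this subset order here, and the function names, are assumptions made for the example (the sketch compares given outcomes, it does not enumerate strategies).

```python
from typing import Callable, Iterable, Tuple

Payoff = Tuple[int, ...]
Leq = Callable[[Payoff, Payoff], bool]


def subset_leq(u: Payoff, v: Payoff) -> bool:
    """u is below v when every objective satisfied in u is also satisfied in v."""
    return all(a <= b for a, b in zip(u, v))


def strictly_better(leq: Leq, u: Payoff, v: Payoff) -> bool:
    """v is strictly better than u: u below v, but not v below u."""
    return leq(u, v) and not leq(v, u)


def stable_def_2_4(leq: Leq, equilibrium: Payoff, deviations: Iterable[Payoff]) -> bool:
    """Definition 2.4: every deviation is worse than or equivalent to the equilibrium."""
    return all(leq(d, equilibrium) for d in deviations)


def stable_alternative(leq: Leq, equilibrium: Payoff, deviations: Iterable[Payoff]) -> bool:
    """Alternative of Remark 2.5: no deviation is strictly better than the equilibrium."""
    return not any(strictly_better(leq, equilibrium, d) for d in deviations)


# Hypothetical situation: the equilibrium outcome satisfies objective 1 only,
# while the single available deviation satisfies objective 2 only.
EQUILIBRIUM = (1, 0)
DEVIATIONS = [(0, 1)]
```

Here the deviation is incomparable with the equilibrium payoff, so the profile is rejected under Definition 2.4 but accepted under the alternative definition; with a total preorder the two tests always agree.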

In this paper we will give a general construction that relates Nash equilibria in a game
(which can be infinite) and winning strategies in a two-player turn-based game (called the
suspect game); it is presented in Section ??. We will then be mostly interested in solving the
decision problems that we define next, when games are finite.

2.4. Decision problems we will consider. Given a concurrent game $\mathcal{G} = (\mathsf{States}, \mathsf{Agt},
\mathsf{Act}, \mathsf{Mov}, \mathsf{Tab}, (\lesssim_A)_{A \in \mathsf{Agt}})$ and a state $s \in \mathsf{States}$, we consider the following problems:

• Value problem: Given a player $A$ and a play $\pi$, is there a strategy $\sigma_A$ for player $A$ such
that for any outcome $\rho$ in $\mathcal{G}$ from $s$ of $\sigma_A$, it holds $\pi \lesssim_A \rho$?

• NE Existence problem: Does there exist a Nash equilibrium in $\mathcal{G}$ from $s$?

• Constrained NE existence problem: Given two plays $\pi_A^{-}$ and $\pi_A^{+}$ for each player $A$, does
there exist a Nash equilibrium in $\mathcal{G}$ from $s$ whose outcome $\pi$ satisfies $\pi_A^{-} \lesssim_A \pi \lesssim_A \pi_A^{+}$ for
all $A \in \mathsf{Agt}$?

We will focus on decidability and complexity results of these three problems when games 
are finite, for various classes of preference relations. Complexity results will heavily rely on 
what preorders we allow for the preference relation and how they are represented. We have 
already discussed the representation of the game structure in Remark 2.2. We define and
discuss now the various preference relations we will study, and explain how we encode the 
various inputs to the problems. 

2.5. Focus on the preference relations we will consider. We define the various classes
of preference relations we will focus on in the rest of the paper. We begin with
single-objective preference relations, and we then define a more general class of ordered objectives.
We fix a game $\mathcal{G} = (\mathsf{States}, \mathsf{Agt}, \mathsf{Act}, \mathsf{Mov}, \mathsf{Tab}, (\lesssim_A)_{A \in \mathsf{Agt}})$.

Figure 4. Two different notions of improvements for a non-total order.

2.5.1. Single-objective preference relations.

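Definition 2.6. An objective (or winning condition) is an arbitrary set of plays. A
preference relation $\lesssim_A$ is single-objective whenever there exists an objective $\Omega_A$ such that: $\rho \lesssim_A \rho'$
if, and only if, $\rho' \in \Omega_A$ (we then say that $\rho'$ is winning for $A$) or $\rho \notin \Omega_A$ (we then say that
$\rho$ is losing for $A$).

The setting of single-objective preference relations is purely qualitative, since a player
can only win (in case the outcome is in her objective), or lose (otherwise).

An objective can be specified in various ways. Next we will consider the following
families of ω-regular objectives:

• A reachability objective is given by a target set $T \subseteq \mathsf{States}$ and the corresponding set
of winning plays is defined by

$\Omega_T^{\mathrm{Reach}} = \{\rho \in \mathsf{Play} \mid \mathsf{Occ}(\rho) \cap T \neq \emptyset\}.$

• A safety objective is given by a target set $T \subseteq \mathsf{States}$ and the corresponding set of
winning plays is defined by

$\Omega_T^{\mathrm{Safety}} = \{\rho \in \mathsf{Play} \mid \mathsf{Occ}(\rho) \cap T = \emptyset\}.$

• A Büchi objective is given by a target set $T \subseteq \mathsf{States}$ and the corresponding set of
winning plays is defined by

$\Omega_T^{\mathrm{Büchi}} = \{\rho \in \mathsf{Play} \mid \mathsf{Inf}(\rho) \cap T \neq \emptyset\}.$

• A co-Büchi objective is given by a target set $T \subseteq \mathsf{States}$ and the corresponding set of
winning plays is defined by

$\Omega_T^{\mathrm{co\text{-}Büchi}} = \{\rho \in \mathsf{Play} \mid \mathsf{Inf}(\rho) \cap T = \emptyset\}.$

• A parity objective is given by a priority function $p\colon \mathsf{States} \to [\![0, d]\!]$ (where $[\![0, d]\!] =
[0, d] \cap \mathbb{Z}$) with $d \in \mathbb{N}$, and the corresponding set of winning plays is defined by

$\Omega_p^{\mathrm{Parity}} = \{\rho \in \mathsf{Play} \mid \min(\mathsf{Inf}(p(\rho))) \text{ is even}\}.$

• A Streett objective is given by a tuple $(Q_i, R_i)_{i \in [\![1, k]\!]}$ and the corresponding set of
winning plays is defined by

$\Omega^{\mathrm{Streett}}_{(Q_i, R_i)_{i \in [\![1, k]\!]}} = \{\rho \in \mathsf{Play} \mid \forall i.\ \mathsf{Inf}(\rho) \cap Q_i \neq \emptyset \Rightarrow \mathsf{Inf}(\rho) \cap R_i \neq \emptyset\}.$

• A Rabin objective is given by a tuple $(Q_i, R_i)_{i \in [\![1, k]\!]}$ and the corresponding set of winning
plays is defined by

$\Omega^{\mathrm{Rabin}}_{(Q_i, R_i)_{i \in [\![1, k]\!]}} = \{\rho \in \mathsf{Play} \mid \exists i.\ \mathsf{Inf}(\rho) \cap Q_i \neq \emptyset \wedge \mathsf{Inf}(\rho) \cap R_i = \emptyset\}.$

• A Muller objective is given by a finite set $C$, a coloring function $c\colon \mathsf{States} \to C$, and a
set $\mathcal{F} \subseteq 2^{C}$. The corresponding set of winning plays is then defined by

$\Omega^{\mathrm{Muller}}_{c, \mathcal{F}} = \{\rho \in \mathsf{Play} \mid \mathsf{Inf}(c(\rho)) \in \mathcal{F}\}.$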
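Since each of the objectives above only depends on Occ(ρ) and Inf(ρ), membership of a play can be tested directly on these two sets. The following Python sketch gives one predicate per objective; the function names and the sample sets are hypothetical, and the sets Occ and Inf are assumed to be given (for instance, computed from a lasso-shaped play).

```python
from typing import Dict, FrozenSet, Hashable, Iterable, Set, Tuple

StateSet = Set[Hashable]


def reachability(occ: StateSet, target: StateSet) -> bool:
    return bool(occ & target)          # Occ meets T


def safety(occ: StateSet, target: StateSet) -> bool:
    return not (occ & target)          # Occ avoids T


def buchi(inf: StateSet, target: StateSet) -> bool:
    return bool(inf & target)          # Inf meets T


def co_buchi(inf: StateSet, target: StateSet) -> bool:
    return not (inf & target)          # Inf avoids T


def parity(inf: StateSet, priority: Dict[Hashable, int]) -> bool:
    return min(priority[s] for s in inf) % 2 == 0   # least recurring priority is even


def streett(inf: StateSet, pairs: Iterable[Tuple[StateSet, StateSet]]) -> bool:
    return all(not (inf & q) or bool(inf & r) for q, r in pairs)


def rabin(inf: StateSet, pairs: Iterable[Tuple[StateSet, StateSet]]) -> bool:
    return any(bool(inf & q) and not (inf & r) for q, r in pairs)


def muller(inf: StateSet, color: Dict[Hashable, Hashable],
           family: Set[FrozenSet[Hashable]]) -> bool:
    return frozenset(color[s] for s in inf) in family


# Hypothetical play visiting s0 once and then s1, s2 forever.
OCC = {"s0", "s1", "s2"}
INF = {"s1", "s2"}
```

On the same pair (Q, R), the Streett and Rabin predicates are complementary, which is consistent with the quantifier duality in the two definitions.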

We will also consider the following other types of objectives:

• A circuit objective is given by a Boolean circuit $C$ with the set $\mathsf{States}$ as input nodes
and one output node. A play $\rho$ is winning if and only if $C$ evaluates to true when the
input nodes corresponding to states in $\mathsf{Inf}(\rho)$ are set to true, and all other input nodes
are set to false. We write $\Omega_C^{\mathrm{Circuit}}$ for the set of winning plays.
Figure 5 displays an example of a circuit for the game of Figure 3: this Boolean circuit
defines the condition that either $\ell_3$ appears infinitely often, or if $\ell_1$ appears infinitely
often then so does $\ell_2$.

• A deterministic Büchi automaton objective is given by a deterministic Büchi
automaton $\mathcal{A} = (Q, \Sigma, \delta, q_0, R)$, with $\Sigma = \mathsf{States}$. Then the corresponding set of winning
plays is defined by

$\Omega_{\mathcal{A}}^{\mathrm{det\text{-}Büchi\text{-}aut}} = \mathcal{L}(\mathcal{A}).$

• A deterministic Rabin automaton objective is given by a deterministic Rabin
automaton $\mathcal{A} = (Q, \Sigma, \delta, q_0, (E_i, F_i)_{i \in [\![1, k]\!]})$, with $\Sigma = \mathsf{States}$. Then the corresponding set of
winning plays is defined by

$\Omega_{\mathcal{A}}^{\mathrm{det\text{-}Rabin\text{-}aut}} = \mathcal{L}(\mathcal{A}).$

• A Presburger-definable objective is given by a Presburger formula $\phi$ with free
variables $(X_s)_{s \in \mathsf{States}}$. The corresponding set of winning plays is defined by

$\Omega_{\phi}^{\mathrm{Presb}} = \{\rho \in \mathsf{Play} \mid \phi((\#_s(\rho))_{s \in \mathsf{States}}) = 0\}$

where $\#_s(\rho)$ is the number of occurrences^3 of state $s$ along $\rho$.

Figure 5. Boolean circuit defining the condition that either $\ell_3$ appears
infinitely often, or if $\ell_1$ appears infinitely often then so does $\ell_2$.
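The evaluation rule for circuit objectives (inputs for states in Inf(ρ) set to true, all others to false) can be sketched in Python as follows. The nested-tuple encoding is a hypothetical simplification: it represents a formula tree rather than a circuit with shared gates, and the gate layout below is only one possible way to express the condition stated for Figure 5, not a transcription of the figure.

```python
from typing import Set, Tuple, Union

# A node is ("in", state), ("not", node), ("and", node, ...) or ("or", node, ...).
Node = Union[Tuple[str, str], tuple]


def eval_circuit(node: Node, inf: Set[str]) -> bool:
    """Evaluate the circuit with input `s` set to true exactly when s is in `inf`."""
    kind = node[0]
    if kind == "in":
        return node[1] in inf
    if kind == "not":
        return not eval_circuit(node[1], inf)
    if kind == "and":
        return all(eval_circuit(child, inf) for child in node[1:])
    if kind == "or":
        return any(eval_circuit(child, inf) for child in node[1:])
    raise ValueError(f"unknown gate {kind!r}")


# "l3 infinitely often, or (l1 infinitely often implies l2 infinitely often)",
# with the implication rewritten as (not l1) or l2.
CIRCUIT = ("or", ("in", "l3"), ("not", ("in", "l1")), ("in", "l2"))
```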


Encodings. For complexity issues we now make explicit how the various objectives are
encoded:

• Reachability, safety, Büchi and co-Büchi objectives are given by a set $T \subseteq \mathsf{States}$; they
can therefore be encoded using $|\mathsf{States}|$ bits.

• For parity objectives, we assume without loss of generality that $d \le 2 \cdot |\mathsf{States}|$. The
priority function then has size at most $|\mathsf{States}| \cdot \lceil \log(2 \cdot |\mathsf{States}| + 1) \rceil$.

• Streett and Rabin objectives are given by tuples $(Q_i, R_i)_{i \in [\![1, k]\!]}$. Their sizes are given by:
$\sum_{i \in [\![1, k]\!]} |Q_i| \cdot \lceil \log(|\mathsf{States}|) \rceil$.

• Muller objectives are given by a coloring function and a set $\mathcal{F}$. Its size is $|\mathsf{States}| \cdot
\lceil \log(|C|) \rceil + |\mathcal{F}| \cdot \lceil \log(|C|) \rceil$. Note that thanks to the coloring function, this encoding can
be exponentially more succinct than an explicit representation such as the one considered
in [26].

• The size of objectives given by circuits, deterministic automata or Presburger formulas is
that of the corresponding circuits, deterministic automata or Presburger formulas.

^3 By convention, if $s \in \mathsf{Inf}(\rho)$, and variable $X_s$ appears in $\phi$, then $\rho \notin \Omega_{\phi}^{\mathrm{Presb}}$.

Figure 6. Examples of preorders (for n = 3): dotted boxes represent equivalence
classes for the relation $\sim$, defined as $a \sim b \Leftrightarrow a \lesssim b \wedge b \lesssim a$; arrows
represent the preorder relation $\lesssim$ quotiented by $\sim$. Panels: A. Subset preorder;
B. Maximise preorder; C. Counting preorder; D. Lexicographic order, which is the chain
(0,0,0) → (0,0,1) → (0,1,0) → (0,1,1) → (1,0,0) → (1,0,1) → (1,1,0) → (1,1,1).

Encodings of thresholds in inputs of the value and the constrained NE existence problems.
For all the objectives except for those given by automata, whether a play $\rho$ satisfies the
objective or not only depends on the sets $\mathrm{Occ}(\rho)$ and $\mathrm{Inf}(\rho)$. The various thresholds will
therefore be encoded as such pairs $(\mathrm{Occ}, \mathrm{Inf})$.

For deterministic-automata objectives, the thresholds will also be encoded as pairs of
sets of states of the objectives, representing respectively the set of states which are visited
and the set of states which are visited infinitely often.

For the Boolean circuit objectives, whether a play $\rho$ satisfies the objective or not only
depends on the set $\mathrm{Inf}(\rho)$. Therefore we will use as encoding for the threshold a single set
$\mathrm{Inf}$.

For the Presburger formula objectives, we will use as encoding for the thresholds the
Parikh image of the play (i.e., the number of visits to each of the states).

2.5.2. Ordered objectives. We now turn to a more general class of preference relations,
allowing for a semi-quantitative setting.

Definition 2.7. An ordered objective is a pair $\omega = ((\Omega_i)_{1 \le i \le n}, \lesssim)$, where, for every
$1 \le i \le n$, $\Omega_i$ is an objective, and $\lesssim$ is a preorder on $\{0,1\}^n$. A play $\rho$ is assigned a payoff
vector w.r.t. that ordered objective, which is defined as
$\mathrm{payoff}_\omega(\rho) = \mathbb{1}_{\{i \mid \rho \in \Omega_i\}} \in \{0,1\}^n$ (where $\mathbb{1}_S$ is the vector $v$ such that
$v_i = 1 \Leftrightarrow i \in S$). The corresponding preference relation $\lesssim_\omega$ is then
defined by $\rho \lesssim_\omega \rho'$ if, and only if, $\mathrm{payoff}_\omega(\rho) \lesssim \mathrm{payoff}_\omega(\rho')$.

There are many ways of specifying a preorder. We define below the preorders on $\{0,1\}^n$
that we consider in the sequel. Figure 6 displays four such preorders for $n = 3$. For the
purpose of these definitions, we assume that $\max \emptyset = -\infty$.
(1) Conjunction: $v \lesssim w$ if, and only if, either $v_i = 0$ for some $1 \le i \le n$, or $w_i = 1$ for
all $1 \le i \le n$. This corresponds to the case where a player wants to achieve all her
objectives.

(2) Disjunction: $v \lesssim w$ if, and only if, either $v_i = 0$ for all $1 \le i \le n$, or $w_i = 1$ for
some $1 \le i \le n$. The aim here is to satisfy at least one objective.

(3) Counting: $v \lesssim w$ if, and only if, $|\{i \mid v_i = 1\}| \le |\{i \mid w_i = 1\}|$. The aim is to maximise
the number of conditions that are satisfied.

(4) Subset: $v \lesssim w$ if, and only if, $\{i \mid v_i = 1\} \subseteq \{i \mid w_i = 1\}$: in this setting, a player will
always struggle to satisfy a larger (for inclusion) set of objectives.

(5) Maximise: $v \lesssim w$ if, and only if, $\max\{i \mid v_i = 1\} \le \max\{i \mid w_i = 1\}$. The aim is to
maximise the highest index of the objectives that are satisfied.

(6) Lexicographic: $v \lesssim w$ if, and only if, either $v = w$, or there is $1 \le i \le n$ such that
$v_i = 0$, $w_i = 1$ and $v_j = w_j$ for all $1 \le j < i$.

(7) Boolean Circuit: given a Boolean circuit, with input from $\{0,1\}^{2n}$, $v \lesssim w$ if, and only
if, the circuit evaluates to 1 on input $v_1 \ldots v_n w_1 \ldots w_n$.

(8) Monotonic Boolean Circuit: same as above, with the restriction that the input gates
corresponding to $v$ are negated, and no other negation appears in the circuit.
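
Preorders (1)-(6) translate directly into executable predicates. The following Python sketch is only an illustration of the definitions above; the function names, and the choice of representing a payoff vector as a tuple of 0/1 entries with objective $i$ stored at position $i-1$, are assumptions made for this example and do not come from the paper.

```python
from typing import Sequence, Set, Tuple

Vec = Sequence[int]  # a payoff vector in {0,1}^n; position i-1 holds v_i


def payoff(satisfied: Set[int], n: int) -> Tuple[int, ...]:
    """Vector 1_S for the set S of (1-based) indices of satisfied objectives."""
    return tuple(1 if i in satisfied else 0 for i in range(1, n + 1))


def conjunction(v: Vec, w: Vec) -> bool:
    # v <= w iff v misses some objective, or w satisfies all of them
    return any(x == 0 for x in v) or all(y == 1 for y in w)


def disjunction(v: Vec, w: Vec) -> bool:
    # v <= w iff v satisfies nothing, or w satisfies at least one objective
    return all(x == 0 for x in v) or any(y == 1 for y in w)


def counting(v: Vec, w: Vec) -> bool:
    return sum(v) <= sum(w)


def subset(v: Vec, w: Vec) -> bool:
    return all(x <= y for x, y in zip(v, w))


def _max_index(v: Vec) -> float:
    # max of the empty set is -infinity, as in the convention of the text
    indices = [i for i, x in enumerate(v, start=1) if x == 1]
    return max(indices) if indices else float("-inf")


def maximise(v: Vec, w: Vec) -> bool:
    return _max_index(v) <= _max_index(w)


def lexicographic(v: Vec, w: Vec) -> bool:
    # Python compares tuples lexicographically, which matches definition (6)
    return tuple(v) <= tuple(w)
```

For instance, with $n = 3$, the vectors (1,1,0) and (0,0,1) are ordered differently by the counting and the maximise preorders: the first satisfies more objectives, the second satisfies the objective of highest index.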

In terms of expressiveness, any preorder over $\{0,1\}^n$ can be given as a Boolean circuit:
for each pair $(v, w)$ with $v \lesssim w$, it is possible to construct a circuit whose output is 1 if,
and only if, the input is $v_1 \ldots v_n w_1 \ldots w_n$; taking the disjunction of all these circuits we
obtain a Boolean circuit defining the preorder. Its size can be bounded by $2^{2n+3} n$, which is
exponential in general. But all the above examples ((1)-(6)) can be specified with a circuit
of polynomial size. In Figure 7 we give a polynomial-size Boolean circuit for the subset
preorder. In the following, for complexity issues, we will assume that the encoding of all
preorders (1)-(6) takes constant size, and that the size of the preorder when it is given as
a Boolean circuit is precisely the size of the circuit for input size $n$, where $n$ is the number
of objectives.

A preorder $\lesssim$ is monotonic if it is compatible with the subset ordering, i.e. if
$\{i \mid v_i = 1\} \subseteq \{i \mid w_i = 1\}$ implies $v \lesssim w$. Hence, a preorder is monotonic if fulfilling more
objectives never results in a lower payoff. All our examples of preorders except for the Boolean
circuit preorder are monotonic. Moreover, any monotonic preorder can be expressed as a monotonic
Boolean circuit: for a pair $(v, w)$ with $v \lesssim w$, we can build a circuit whose output is 1 if,
and only if, the input is $v_1 \ldots v_n w_1 \ldots w_n$. We can require this circuit to have negation at
the leaves. Indeed, if the input $w_j$ appears negated, and if $w_j = 0$, then by monotonicity,
also the input $(v, \tilde{w})$ is accepted, with $\tilde{w}_i = w_i$ when $i \neq j$ and $\tilde{w}_j = 1$. Hence the negated
input gate can be replaced with true. Similarly for positive occurrences of any $v_j$. Hence
any monotonic preorder can be written as a monotonic Boolean circuit. Notice that with
Definition 2.4, any Nash equilibrium $\sigma_{\mathsf{Agt}}$ for the subset preorder is also a Nash equilibrium
for any monotonic preorder.
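
For small $n$, monotonicity can be checked by brute force over all pairs of payoff vectors. The sketch below is an illustration only: `is_monotonic` and the two sample preorders are names chosen for this example, and the exhaustive enumeration is exponential in $n$, so it is meant for experimentation rather than as an algorithm used in the paper.

```python
from itertools import product
from typing import Callable, Sequence

Preorder = Callable[[Sequence[int], Sequence[int]], bool]


def is_monotonic(preorder: Preorder, n: int) -> bool:
    """Check that v <= w holds whenever {i | v_i = 1} is included in {i | w_i = 1}."""
    vectors = list(product((0, 1), repeat=n))
    return all(
        preorder(v, w)
        for v in vectors
        for w in vectors
        if all(x <= y for x, y in zip(v, w))  # pointwise order = set inclusion
    )


def counting(v: Sequence[int], w: Sequence[int]) -> bool:
    return sum(v) <= sum(w)


def reversed_counting(v: Sequence[int], w: Sequence[int]) -> bool:
    # a preorder that rewards satisfying FEWER objectives: not monotonic
    return sum(v) >= sum(w)
```

Here `is_monotonic(counting, 3)` holds, while `reversed_counting` fails the check on the pair ((0,0,0), (1,1,1)).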

Next we will be interested in two kinds of ordered objectives: ordered reachability
objectives, where all objectives are supposed to be reachability objectives, and ordered Büchi
objectives, where all objectives are supposed to be Büchi objectives. Note that other classical
objectives (parity, Streett, Rabin, Muller, etc.) can be equivalently described with a preorder
given by a polynomial-size Boolean circuit over Büchi objectives. For instance, each set of
a Muller condition can be encoded as a conjunction of Büchi and co-Büchi conditions.
Figure 7. Boolean circuit defining the subset preorder

For ordered reachability (resp. Büchi) objectives, thresholds used as inputs to the
various decision problems will be given by the set of states that are visited (resp. visited
infinitely often).

In Sections 6 and 7, we will be interested in games where, for every player $A$, the
preference relation is given by an ordered objective $\omega_A = ((\Omega^A_i)_{1 \le i \le n_A}, \lesssim_A)$. We
will then write $\mathrm{payoff}_A$ instead of $\mathrm{payoff}_{\omega_A}$ for the payoffs, and if $\rho$ is a play,
$\mathrm{payoff}(\rho) = (\mathrm{payoff}_A(\rho))_{A \in \mathsf{Agt}}$.

2.6. Undecidability of all three problems for single Presburger-definable objectives.
We end this section with an undecidability result in the quite general setting of
Presburger-definable preference relations.

Theorem 2.8. The value, NE existence and constrained NE existence problems are undecidable
for finite games with preference relations given by Presburger-definable qualitative
objectives.

Proof. We first prove the result for the constrained NE existence problem, by encoding a
two-counter machine. We fix a two-counter machine, and assume without loss of generality
that the halting state is preceded by a non-zero test for the two counters (hence if the
machine halts, the two counters have a positive value in the halting state).

We begin with defining a family of preorders. Fix two sets of states $S$ and $T$; a play
is said to be $(S = T)$-winning if the number of visits to $S$ equals the number of visits to $T$, and
both are finite. Formally, $\pi \lesssim_{S=T} \pi'$ whenever $\pi$ is not $(S = T)$-winning, or $\pi'$ is.

We use such preorders to encode the acceptance problem for two-counter machines: the
value of counter $c_1$ is encoded as the difference between the number of visits to $S_1$ and $T_1$,
and similarly for counter $c_2$. Incrementing counter $c_i$ thus consists in visiting a state in $S_i$,
and decrementing consists in visiting $T_i$; in other terms, if instruction $q_k$ of the two-counter
machine consists in incrementing $c_1$ and jumping to $q_{k'}$, then the game will have a transition
from some state $q_k$ to a state in $S_1$, and a transition from there to $q_{k'}$. The game involves
three players: $A_1$, $A_2$ and $B$. The aim of player $A_1$ (resp. $A_2$) is to visit $S_1$ and $T_1$ (resp. $S_2$
and $T_2$) the same number of times: player $A_i$'s preference is $\lesssim_{S_i = T_i}$. The aim of player $B$
is to reach the state corresponding to the halting state of the two-counter machine. Due to
the assumption on the two-counter machine, if $B$ wins, then both $A_1$ and $A_2$ lose.
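
The counter encoding and the $(S = T)$-winning condition can be made concrete on ultimately periodic plays. The Python sketch below is a hedged illustration: it assumes a play is given as a pair (finite prefix, repeated cycle), and the names `visits`, `counter_value`, `is_eq_winning` and `weakly_prefers` are ours, not the paper's.

```python
from typing import Hashable, Sequence, Set, Tuple

State = Hashable
Lasso = Tuple[Sequence[State], Sequence[State]]  # (prefix, cycle repeated forever)


def visits(path: Sequence[State], states: Set[State]) -> int:
    """Number of positions of `path` that lie in `states`."""
    return sum(1 for s in path if s in states)


def counter_value(history: Sequence[State], S: Set[State], T: Set[State]) -> int:
    """Value of the encoded counter after `history`: visits to S minus visits to T."""
    return visits(history, S) - visits(history, T)


def is_eq_winning(play: Lasso, S: Set[State], T: Set[State]) -> bool:
    """(S = T)-winning: both visit counts are finite and equal."""
    prefix, cycle = play
    if any(s in S or s in T for s in cycle):
        return False  # S or T is visited infinitely often
    return visits(prefix, S) == visits(prefix, T)


def weakly_prefers(play1: Lasso, play2: Lasso, S: Set[State], T: Set[State]) -> bool:
    """play1 <=_{S=T} play2: play1 is not winning, or play2 is."""
    return (not is_eq_winning(play1, S, T)) or is_eq_winning(play2, S, T)
```

With `S = {"s"}` and `T = {"t"}`, the history `q0 s q1 s q2 t q3` encodes the counter value 1, and a play that stops in a sink after visiting `s` once and `t` once is $(S = T)$-winning.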
It remains to encode the zero-test: this is achieved by the module of Figure 8. In this
module, player $B$ tries to avoid the three sink states (marked in grey), since this would
prevent her from reaching her goal. When entering the module, player $B$ has to choose one
of the available branches: if she decides to go to $u_i^{\neq 0}$, then $A_i$ could take the play into
the self-loop, which is winning for her if $S_i$ and $T_i$ have been visited the same number of
times in the history of this path, which corresponds to having $c_i = 0$; hence player $B$ should
play to $u_i^{\neq 0}$ only if $c_i \neq 0$, so that $A_i$ has no interest in going to this self-loop.

Similarly, if player $B$ decides to go to $u_i^{= 0}$, player $A_i$ has the opportunity to "leave" the
main stream of the game, and go to $s_i$ or $t_i$ (obviously $s_i \in S_i$ and $t_i \in T_i$). If the numbers
of visits to $S_i$ and $T_i$ up to that point are different, then player $A_i$ has the opportunity to
make both numbers equal, and to win. Conversely, if both numbers are equal (i.e., $c_i = 0$),
then going to $s_i$ or $t_i$ will be losing for $A_i$, whatever happens from there. Hence, if $c_i = 0$
when entering the module, then player $B$ should go to $u_i^{= 0}$.

Figure 8. Testing whether $c_i = 0$.

One can then easily show that the two-counter machine stops if, and only if, there is a
Nash equilibrium in the resulting game $\mathcal{G}$, in which player $B$ wins and players $A_1$ and $A_2$
lose. Indeed, assume that the machine stops, and consider the strategies where player $B$
plays (in the first state of the test modules) according to the value of the corresponding
counter, and where players $A_1$ and $A_2$ always keep the play in the main stream of the game.
Since the machine stops, player $B$ wins, while players $A_1$ and $A_2$ lose. Moreover, none of
them has a way to improve their payoff: since player $B$ plays according to the values of the
counters, players $A_1$ and $A_2$ would not benefit from deviating from their above strategies.
Conversely, if there is such a Nash equilibrium, then in any visited test module, player $B$
always plays according to the values of the counters: otherwise, player $A_1$ (or $A_2$) would
have the opportunity to win the game. By construction, this means that the run of the
Nash equilibrium corresponds to the execution of the two-counter machine. As player $B$
wins, this execution reaches the halting state.

Finally, it is not difficult to adapt this reduction to involve only two players: players $A_1$
and $A_2$ would be replaced by one single player $A$, in charge of ensuring that both conditions
(for $c_1$ and $c_2$) are fulfilled. This requires minor changes to the module for testing $c_i = 0$:
when leaving the main stream of the game in a module for testing counter $c_i$, player $A$
should be given the opportunity (after the grey state) to visit states $S_{3-i}$ or $T_{3-i}$ in order
to adjust that part of her objective.

By changing the winning condition for player $B$, the game $\mathcal{G}$ can also be made zero-sum:
for this, $B$ must lose if the play remains in the main stream forever without visiting the
final state; otherwise, $B$ loses if the numbers of visits to $S_i$ and $T_i$ are finite and equal for
both $i = 1$ and $i = 2$; $B$ wins in any other case. The objective of player $A$ is opposite. It is
not difficult to modify the proof above for showing that the two-counter machine halts if,
and only if, player $B$ has a winning strategy in this game.
Finally, by adding a small initial module depicted on Figure 9 to this zero-sum version of the
game $\mathcal{G}$, one can encode the halting problem for two-counter machines to the NE existence
problem. Indeed, in the zero-sum game, there is exactly one Nash equilibrium, with only two
possible payoffs (either $A$ wins, or $B$ wins). Now, assuming that $A$ loses and $B$ wins in
state $s_1$, then there is a (pure) Nash equilibrium in the game extended with the initial
module if, and only if, player $B$ wins in the zero-sum game above. □

3. Preliminary results 

This section contains general results that will be applied later in various settings. In each of 
the statements, we give the restrictions on the games and on the preference relations that 
should be satisfied. 

3.1. Nash equilibria as lasso runs. We first characterise outcomes of Nash equilibria as
ultimately periodic runs, in the case where preference relations only depend on the set of
states that are visited, and on the set of states that are visited infinitely often. Note that
$\omega$-regular conditions satisfy this hypothesis, but Presburger relations such as the ones used
for proving Theorem 2.8 do not.

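Proposition 3.1. Let $\mathcal{G} = (\mathsf{States}, \mathsf{Agt}, \mathsf{Act}, \mathsf{Mov}, \mathsf{Tab}, (\lesssim_A)_{A \in \mathsf{Agt}})$ be a finite concurrent
game such that, for every player $A$, it holds $\rho \sim_A \rho'$ as soon as $\mathrm{Inf}(\rho) = \mathrm{Inf}(\rho')$ and
$\mathrm{Occ}(\rho) = \mathrm{Occ}(\rho')$. Let $\rho \in \mathsf{Play}$. If there is a Nash equilibrium with outcome $\rho$, then there
is a Nash equilibrium with outcome $\rho'$ of the form $\pi \cdot \tau^\omega$ such that $\rho \sim_A \rho'$, and where $|\pi|$
and $|\tau|$ are bounded by $|\mathsf{States}|^2$.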

Proof. Let $\sigma_{\mathsf{Agt}}$ be a Nash equilibrium from some state $s$, and $\rho$ be its outcome. We define
a new strategy profile $\sigma'_{\mathsf{Agt}}$ whose outcome from $s$ is ultimately periodic, and then show
that $\sigma'_{\mathsf{Agt}}$ is a Nash equilibrium from $s$.

To begin with, we inductively construct a history $\pi = \pi_0 \pi_1 \ldots \pi_n$ that is not too long
and visits precisely those states that are visited by $\rho$ (that is, $\mathrm{Occ}(\pi) = \mathrm{Occ}(\rho)$).

The initial state is $\pi_0 = \rho_0 = s$. Then we assume we have constructed $\pi_{\le k} = \pi_0 \ldots \pi_k$
which visits exactly the same states as $\rho_{\le k'}$ for some $k'$. If all the states of $\rho$ have been
visited in $\pi_{\le k}$ then the construction is over. Otherwise there is an index $i$ such that $\rho_i$
does not appear in $\pi_{\le k}$. We therefore define our next target as the smallest such $i$: we let
$t(\pi_{\le k}) = \min\{i \mid \forall j \le k.\ \pi_j \neq \rho_i\}$. We then look at the occurrence of the current state $\pi_k$
that is the closest to the target in $\rho$: we let $c(\pi_{\le k}) = \max\{j < t(\pi_{\le k}) \mid \pi_k = \rho_j\}$. Then
we emulate what happens at that position by choosing $\pi_{k+1} = \rho_{c(\pi_{\le k})+1}$. Then $\pi_{k+1}$ is
either the target, or a state that has already been seen before in $\pi_{\le k}$, in which case the
resulting $\pi_{\le k+1}$ visits exactly the same states as $\rho_{\le c(\pi_{\le k})+1}$.

At each step, either the number of remaining targets strictly decreases, or the number
of remaining targets is constant but the distance to the next target strictly decreases.

Footnote (on $\sim_A$): We recall that $\rho \sim_A \rho'$ if, and only if, $\rho \lesssim_A \rho'$ and $\rho' \lesssim_A \rho$.



Figure 9. Extending the game with an initial concurrent module (the module leads into a copy of $\mathcal{G}$).
Therefore the construction terminates. Moreover, notice that between two targets we do
not visit the same state twice, and we visit only states that have already been visited, plus
the target. As the number of targets is bounded by $|\mathsf{States}|$, we get that the length of the
path $\pi$ constructed thus far is bounded by $1 + |\mathsf{States}| \cdot (|\mathsf{States}| - 1)/2$.
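
The inductive construction of the short history can be sketched in a few lines of Python. This is a minimal illustration under explicit assumptions: the play is given as a finite list `rho` that is long enough to contain every state the play visits, and the function name `short_history` is chosen for this example only.

```python
from typing import Hashable, List, Sequence

State = Hashable


def short_history(rho: Sequence[State]) -> List[State]:
    """Build a history visiting exactly the states of `rho`, using only transitions of `rho`.

    Follows the target/closest-occurrence scheme of the proof: the target t is the
    first position of rho holding a state not yet visited, c is the last position
    before t holding the current state, and the next state is rho[c + 1].
    """
    pi: List[State] = [rho[0]]
    seen = {rho[0]}
    all_states = set(rho)
    while seen != all_states:
        # next target: smallest index whose state has not been visited yet
        target = min(i for i, s in enumerate(rho) if s not in seen)
        # occurrence of the current state that is closest to the target
        closest = max(j for j in range(target) if rho[j] == pi[-1])
        nxt = rho[closest + 1]
        pi.append(nxt)
        seen.add(nxt)
    return pi
```

On the prefix `a b a b c a d`, the construction skips the repeated `a b` and returns `a b c a d`: the state `a` is revisited once because it is the only way, in the original play, to reach `d` from `c`.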

Using similar ideas, we now inductively construct $\tau = \tau_0 \tau_1 \ldots \tau_m$, which visits precisely
those states which are seen infinitely often along $\rho$, and which is not too long. Let $l$
be the least index after which the states visited by $\rho$ are visited infinitely often, i.e.
$l = \min\{i \in \mathbb{N} \mid \forall j \ge i.\ \rho_j \in \mathrm{Inf}(\rho)\}$. The run $\rho_{\ge l}$ is such that its set of visited states and its
set of states visited infinitely often coincide. We therefore define $\tau$ in the same way we have
defined $\pi$ above, but for play $\rho_{\ge l}$. As a by-product, we also get $c(\tau_{\le k})$, for $k < m$.

We now need to glue $\pi$ and $\tau$ together, and to ensure that $\tau$ can be glued to itself, so
that $\pi \cdot \tau^\omega$ is a real run. We therefore need to link the last state of $\pi$ with the first state
of $\tau$ (and similarly the last state of $\tau$ with its first state). This possibly requires appending
some more states to $\pi$ and $\tau$: we fix the target of $\pi$ and $\tau$ to be $\tau_0$, and apply the same
construction as previously until the target is reached. The total length of the resulting
paths $\pi'$ and $\tau'$ is bounded by $1 + (|\mathsf{States}| - 1) \cdot (|\mathsf{States}| + 2)/2$, which is at most $|\mathsf{States}|^2$.
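
The comparison of this bound with $|\mathsf{States}|^2$ is a one-line computation. Writing $n = |\mathsf{States}| \ge 1$ (the symbol $n$ is introduced here only for readability):

```latex
1 + \frac{(n-1)(n+2)}{2}
  \;=\; 1 + \frac{n^2 + n - 2}{2}
  \;=\; \frac{n^2 + n}{2}
  \;\le\; n^2
  \qquad \text{since } n \le n^2 \text{ for } n \ge 1 .
```

The inequality is strict as soon as $n \ge 2$.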

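We let $\rho' = \pi' \cdot \tau'^\omega$, and abusively write $c(\rho'_{\le k})$ for $c(\pi'_{\le k})$ if $k < |\pi'|$, and for
$c(\tau'_{\le k'})$ with $k' = (k - 1 - |\pi'|) \bmod |\tau'|$ otherwise. We now define our new strategy profile
$\sigma'_{\mathsf{Agt}}$, having $\rho'$ as outcome from $s$. Given a history $h$:

• if $h$ followed the expected path, i.e., $h = \rho'_{\le k}$ for some $k$, we mimic the strategy at $c(h)$:
$\sigma'_{\mathsf{Agt}}(h) = \sigma_{\mathsf{Agt}}(\rho_{\le c(h)})$. This way, $\rho'$ is the outcome of $\sigma'_{\mathsf{Agt}}$ from $s$.

• otherwise we take the longest prefix $h_{\le k}$ that is a prefix of $\rho'$, and define
$\sigma'_{\mathsf{Agt}}(h) = \sigma_{\mathsf{Agt}}(\rho_{\le c(h_{\le k})} \cdot h_{\ge k+1})$.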

We now show that $\sigma'_{\mathsf{Agt}}$ is a Nash equilibrium. Assume that one of the players changes
her strategy while playing according to $\sigma'_{\mathsf{Agt}}$: either the resulting outcome does not deviate
from $\pi \cdot \tau^\omega$, in which case the payoff of that player is not improved; or it deviates at some
point, and from that point on, follows the same strategies as in $\sigma_{\mathsf{Agt}}$. Assume that the
resulting outcome is an improvement over $\rho'$ for the player who deviated. The suffix of the
play after the deviation is the suffix of a play of $\sigma_{\mathsf{Agt}}$ after a deviation by the same player.
By construction, both plays have the same sets of visited and infinitely-visited states. Hence
we have found an advantageous deviation from $\sigma_{\mathsf{Agt}}$ for one player, contradicting the fact
that $\sigma_{\mathsf{Agt}}$ is a Nash equilibrium. □

3.2. Encoding the value problem as a constrained NE existence problem. We now
give a reduction that will be used to infer hardness results for the constrained NE existence
problem from the hardness of the value problem (as defined in Section 2.4): this will be
the case when the hardness proof for the value problem involves the construction of a game
satisfying the hypotheses of the proposition.

Proposition 3.2. Let $\mathcal{G} = (\mathsf{States}, \mathsf{Agt}, \mathsf{Act}, \mathsf{Mov}, \mathsf{Tab}, (\lesssim_A)_{A \in \mathsf{Agt}})$ be a two-player
zero-sum game played between players $A$ and $B$, such that:

• the preference relation for player A is total, Noetherian and almost-well-founded 
(see Section \KW; 

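• $\mathcal{G}$ is determined, i.e., for every play $\pi$:

$[\exists \sigma_A.\ \forall \sigma_B.\ \pi \lesssim_A \mathrm{Out}(\sigma_A, \sigma_B)] \Leftrightarrow [\forall \sigma_B.\ \exists \sigma_A.\ \pi \lesssim_A \mathrm{Out}(\sigma_A, \sigma_B)]$.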
Let $\mathcal{G}'$ be the (non-zero-sum) game obtained from $\mathcal{G}$ by replacing the preference relation of
player $B$ by the one where all plays are equivalent. Then, for every state $s$, for every play
$\pi$ from $s$, the two following properties are equivalent:

(i) there is a Nash equilibrium in $\mathcal{G}'$ from $s$ with outcome $\rho$ such that $\pi \not\lesssim_A \rho$;

(ii) player $A$ cannot ensure $\pi$ from $s$ in $\mathcal{G}$.

Proof. In this proof, $\sigma_A$ and $\sigma'_A$ (resp. $\sigma_B$ and $\sigma'_B$) refer to player-$A$ (resp. player-$B$)
strategies. Furthermore we will write $\mathrm{Out}(\sigma_A, \sigma_B)$ instead of $\mathrm{Out}_{\mathcal{G}}(s, (\sigma_A, \sigma_B))$.

We first assume there is a Nash equilibrium $(\sigma_A, \sigma_B)$ in $\mathcal{G}'$ from $s$ such that
$\pi \not\lesssim_A \mathrm{Out}(\sigma_A, \sigma_B)$. Since $\lesssim_A$ is total, $\mathrm{Out}(\sigma_A, \sigma_B) \prec_A \pi$. Consider a strategy $\sigma'_A$ of player $A$ in
$\mathcal{G}$. As $(\sigma_A, \sigma_B)$ is a Nash equilibrium, it holds that $\mathrm{Out}(\sigma'_A, \sigma_B) \lesssim_A \mathrm{Out}(\sigma_A, \sigma_B)$, which
implies $\mathrm{Out}(\sigma'_A, \sigma_B) \prec_A \pi$. We conclude that condition (ii) holds.

Assume now property (ii). As the preference relation is Noetherian, we can select
$\pi^+$ which is the largest element for $\lesssim_A$ which can be ensured by player $A$. Let $\sigma_A$ be
a corresponding strategy: for every strategy $\sigma_B$, $\pi^+ \lesssim_A \mathrm{Out}(\sigma_A, \sigma_B)$. Towards a
contradiction, assume now that for every strategy $\sigma'_B$, there exists a strategy $\sigma'_A$ such that
$\pi^+ \prec_A \mathrm{Out}(\sigma'_A, \sigma'_B)$. Consider the set $S$ of such outcomes, and define $\pi'$ as its minimal
element (this is possible since the order is almost-well-founded). Notice then that $\pi^+ \prec_A \pi'$,
and also that for every strategy $\sigma'_B$, there exists a strategy $\sigma'_A$ such that $\pi' \lesssim_A \mathrm{Out}(\sigma'_A, \sigma'_B)$.
Then, as the game is determined, we get that there exists some strategy $\sigma'_A$ such that
for every strategy $\sigma'_B$, it holds that $\pi' \lesssim_A \mathrm{Out}(\sigma'_A, \sigma'_B)$. In particular, strategy $\sigma'_A$ ensures
$\pi'$, which contradicts the maximality of $\pi^+$. Therefore, there is some strategy $\sigma'_B$ for
which for every strategy $\sigma'_A$, $\pi^+ \not\prec_A \mathrm{Out}(\sigma'_A, \sigma'_B)$, which means $\mathrm{Out}(\sigma'_A, \sigma'_B) \lesssim_A \pi^+$. We
show now that $(\sigma_A, \sigma'_B)$ is a witness for property (i). We have seen on the one hand that
$\pi^+ \lesssim_A \mathrm{Out}(\sigma_A, \sigma'_B)$, and on the other hand that $\mathrm{Out}(\sigma_A, \sigma'_B) \lesssim_A \pi^+$. By hypothesis,
$\pi^+ \prec_A \pi$, which yields $\mathrm{Out}(\sigma_A, \sigma'_B) \prec_A \pi$. Pick another strategy $\sigma'_A$ for player $A$. We have
seen that $\mathrm{Out}(\sigma'_A, \sigma'_B) \lesssim_A \pi^+$, which implies $\mathrm{Out}(\sigma'_A, \sigma'_B) \lesssim_A \mathrm{Out}(\sigma_A, \sigma'_B)$. This concludes
the proof of (i). □

Remark 3.3. Any finite total preorder is obviously Noetherian and almost-well-founded.
Also, any total preorder isomorphic to the set of non-positive integers is Noetherian and
almost-well-founded. On the other hand, a total preorder isomorphic to $\{1/n \mid n \in \mathbb{N}^+\}$ is
Noetherian but not almost-well-founded.


3.3. Encoding the value problem as a NE existence problem. We prove a similar
result for the NE existence problem. In this reduction however, we have to modify the game
by introducing a truly concurrent move at the beginning of the game. This is necessary
since for turn-based games with $\omega$-regular winning conditions, there always exists a Nash
equilibrium [15], hence the NE existence problem would be trivial.

Let $\mathcal{G} = (\mathsf{States}, \mathsf{Agt}, \mathsf{Act}, \mathsf{Mov}, \mathsf{Tab}, (\lesssim_A)_{A \in \mathsf{Agt}})$ be a two-player zero-sum game, with
players $A$ and $B$. Given a state $s$ of $\mathcal{G}$ and a play $\pi$ from $s$, we define a game $\mathcal{G}_\pi$ by
adding two states $s_0$ and $s_1$, in the very same way as in Figure 9. From $s_0$,
$A$ and $B$ play a matching-penny game to either go to the sink state $s_1$, or to the state $s$
in the game $\mathcal{G}$. We assume the same hypotheses as in Proposition 3.2 for the preference
relation $\lesssim_A$. Let $\pi^+$ be in the highest equivalence class for $\lesssim_A$ smaller than $\pi$ (it exists since
$\lesssim_A$ is Noetherian). In $\mathcal{G}_\pi$, player $B$ prefers runs that end in $s_1$: formally, the preference
relation $\lesssim^\pi_B$ of player $B$ in $\mathcal{G}_\pi$ is given by
$\pi' \lesssim^\pi_B \pi'' \Leftrightarrow \pi'' = s_0 \cdot s_1^\omega \vee \pi' \neq s_0 \cdot s_1^\omega$. On the
other hand, player $A$ prefers a path of $\mathcal{G}$ over going to $s_1$ if, and only if, it is at least as
good as $\pi$: formally, the preference relation $\lesssim^\pi_A$ for player $A$ in $\mathcal{G}_\pi$ is given by
$s_0 \cdot \pi' \lesssim^\pi_A s_0 \cdot \pi'' \Leftrightarrow \pi' \lesssim_A \pi''$, and $s_0 \cdot s_1^\omega \sim^\pi_A s_0 \cdot \pi^+$.

Proposition 3.4. Let $\mathcal{G} = (\mathsf{States}, \mathsf{Agt}, \mathsf{Act}, \mathsf{Mov}, \mathsf{Tab}, (\lesssim_A)_{A \in \mathsf{Agt}})$ be a two-player
zero-sum game, with players $A$ and $B$, such that:

• the preference relation for player $A$ is total, Noetherian and almost-well-founded;

• $\mathcal{G}$ is determined.

Let $s$ be a state and $\pi$ be a play in $\mathcal{G}$ from $s$. Consider the game $\mathcal{G}_\pi$ defined above. Then
the following two properties are equivalent:

(i) there is a Nash equilibrium in $\mathcal{G}_\pi$ from $s_0$;

(ii) player $A$ cannot ensure $\pi$ from $s$ in $\mathcal{G}$.

In particular, in a given class of games, if the hardness proof of the value problem involves
a game which satisfies the hypotheses of the proposition, and if $\mathcal{G}_\pi$ belongs to that class,
then the NE existence problem is at least as hard as the complement of the value problem.
Proof. Assume that player $A$ cannot ensure at least $\pi$ from $s$ in $\mathcal{G}$; then according to
Proposition 3.2 there is a Nash equilibrium $(\sigma_A, \sigma_B)$ in the game $\mathcal{G}'$ of Proposition 3.2 with
outcome $\rho$ such that $\pi \not\lesssim_A \rho$. Consider the strategy profile $(\sigma^\pi_A, \sigma^\pi_B)$ in $\mathcal{G}_\pi$ that consists
in playing the same action for both players in $s_0$, and then if the path goes to $s$, to play
according to $(\sigma_A, \sigma_B)$. Player $B$ gets her best possible payoff under that strategy profile.
If $A$ could change her strategy to get a payoff better than $s_0 \cdot \pi^+$, then it would induce
a strategy in $\mathcal{G}'$ giving her a payoff better than $\rho$ (when played with strategy $\sigma_B$), which
contradicts the fact that $(\sigma_A, \sigma_B)$ is a Nash equilibrium in $\mathcal{G}'$. Therefore, $(\sigma^\pi_A, \sigma^\pi_B)$ is a Nash
equilibrium in $\mathcal{G}_\pi$.

Conversely, assume that $A$ can ensure $\pi$ from $s$ in $\mathcal{G}$, and assume towards a contradiction
that there is a Nash equilibrium $(\sigma^\pi_A, \sigma^\pi_B)$ in $\mathcal{G}_\pi$ from $s_0$. Then $\mathrm{Out}_{\mathcal{G}_\pi}(\sigma^\pi_A, \sigma^\pi_B)$ does not end
in $s_1$, otherwise player $A$ could improve by switching to $s$ and then playing according to a
strategy which ensures $\pi$. Also, $\mathrm{Out}_{\mathcal{G}_\pi}(\sigma^\pi_A, \sigma^\pi_B)$ cannot end in $\mathcal{G}$ either, otherwise player $B$
would improve by switching to $s_1$. We get that there is no Nash equilibrium in $\mathcal{G}_\pi$ from $s_0$,
which concludes the proof. □
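
The role of the initial matching-penny module can be illustrated on a drastically simplified model. In the Python sketch below, the whole continuation from $s$ is collapsed into a single numerical payoff for each player, which is an assumption made for this example only (the actual proof reasons on strategies in the full game); the names `outcome` and `pure_nash` are ours.

```python
from itertools import product
from typing import Dict, List, Tuple

ACTIONS = ("h", "t")


def outcome(a: str, b: str) -> str:
    """Matching pennies: equal actions lead to the sink s1, different ones to s."""
    return "s1" if a == b else "s"


def pure_nash(payoff_a: Dict[str, int], payoff_b: Dict[str, int]) -> List[Tuple[str, str]]:
    """Enumerate pure Nash equilibria of the one-shot module at s0."""
    equilibria = []
    for a, b in product(ACTIONS, repeat=2):
        current = outcome(a, b)
        # neither player can strictly improve by a unilateral change of action
        a_ok = all(payoff_a[outcome(a2, b)] <= payoff_a[current] for a2 in ACTIONS)
        b_ok = all(payoff_b[outcome(a, b2)] <= payoff_b[current] for b2 in ACTIONS)
        if a_ok and b_ok:
            equilibria.append((a, b))
    return equilibria


# Player B always prefers ending in the sink s1.
PAYOFF_B = {"s1": 1, "s": 0}
# Case 1: A can secure something strictly better than the sink by going to s.
A_CAN_ENSURE = {"s1": 0, "s": 1}
# Case 2: A cannot do better than the sink by going to s.
A_CANNOT_ENSURE = {"s1": 0, "s": 0}
```

In the first case the module behaves like plain matching pennies and has no pure equilibrium; in the second case both players agree on the sink, so playing the same action is an equilibrium.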

3.4. Encoding the constrained NE existence problem as an NE existence problem.
The next proposition makes a link between the existence of a Nash equilibrium where
a player gets a payoff larger than some bound and the (unconstrained) existence of a Nash
equilibrium in a new game. This will allow, in some specific cases, to infer hardness results
from the constrained NE existence problem to the NE existence problem.

The construction is inspired by the previous one, but it applies to a game with at least
two players, and it applies to any two selected players as follows. Let
$\mathcal{G} = (\mathsf{States}, \mathsf{Agt}, \mathsf{Act}, \mathsf{Mov}, \mathsf{Tab}, (\lesssim_A)_{A \in \mathsf{Agt}})$ be a concurrent game, $s$ be a state of $\mathcal{G}$,
$\rho$ be a play from $s$, and $A_i$ and $A_j$ be two distinct players. We define the new game
$E(\mathcal{G}, A_i, A_j, \rho)$ again in the same way as on Figure 9. Now, in $s_0$, the two players $A_i$ and $A_j$
play a matching-penny game to either go to the sink state $s_1$, or to state $s$ in game $\mathcal{G}$.

For player $A_j$, the preference relation in $E(\mathcal{G}, A_i, A_j, \rho)$ is given by $\lesssim'_{A_j}$ such that
$s_0 \cdot s_1^\omega \prec'_{A_j} s_0 \cdot \pi$ and $s_0 \cdot \pi \lesssim'_{A_j} s_0 \cdot \pi' \Leftrightarrow \pi \lesssim_{A_j} \pi'$, for any paths $\pi$ and $\pi'$ from $s$ in $\mathcal{G}$.
For player $A_i$ the preference relation is $s_0 \cdot \pi \lesssim'_{A_i} s_0 \cdot \pi' \Leftrightarrow \pi \lesssim_{A_i} \pi'$, for any paths $\pi$ and
$\pi'$ from $s$ in $\mathcal{G}$, and $s_0 \cdot s_1^\omega \sim'_{A_i} s_0 \cdot \rho$. For any other player $A_k$, the preference relation in
$E(\mathcal{G}, A_i, A_j, \rho)$ is given by $s_0 \cdot \pi \lesssim'_{A_k} s_0 \cdot \pi' \Leftrightarrow \pi \lesssim_{A_k} \pi'$ for any paths $\pi$ and $\pi'$ from $s$ in $\mathcal{G}$,
and $s_0 \cdot s_1^\omega \sim'_{A_k} s_0 \cdot \rho$.

Proposition 3.5. Let $\mathcal{G} = (\mathsf{States}, \mathsf{Agt}, \mathsf{Act}, \mathsf{Mov}, \mathsf{Tab}, (\lesssim_A)_{A \in \mathsf{Agt}})$ be a concurrent game,
let $s$ be a state of $\mathcal{G}$, and $A_i$ and $A_j$ be two distinct players participating in $\mathcal{G}$. Pick two plays
$\pi$ and $\rho$ from $s$ such that $\rho \lesssim_{A_i} \pi$. If there is a Nash equilibrium in $\mathcal{G}$ whose outcome is $\pi$,
then there is a Nash equilibrium in $E(\mathcal{G}, A_i, A_j, \rho)$ whose outcome is $s_0 \cdot \pi$. Reciprocally, if
there is a Nash equilibrium in $E(\mathcal{G}, A_i, A_j, \rho)$ whose outcome is $s_0 \cdot \pi$, then there is a Nash
equilibrium in $\mathcal{G}$ whose outcome is $\pi$.

Proof. Assume that there is a Nash equilibrium $\sigma_{\mathsf{Agt}}$ in $\mathcal{G}$ with outcome $\pi$ such that $\rho \lesssim_{A_i} \pi$.
Then $s_0 \cdot s_1^\omega \lesssim'_{A_i} s_0 \cdot \pi$. Consider the strategy profile in $E(\mathcal{G}, A_i, A_j, \rho)$ that consists for $A_i$
and $A_j$ in playing different actions in $s_0$ and, when the path goes to $s$, in playing according
to $\sigma_{\mathsf{Agt}}$. Players $A_i$ and $A_j$ have no interest in changing their strategies in $s_0$, since for $A_j$
all plays of $\mathcal{G}$ are better than $s_0 \cdot s_1^\omega$, and for $A_i$ the play $s_0 \cdot \pi$ is better than $s_0 \cdot s_1^\omega$. Hence,
this is a Nash equilibrium in game $E(\mathcal{G}, A_i, A_j, \rho)$.

Reciprocally, if there is a Nash equilibrium in $E(\mathcal{G}, A_i, A_j, \rho)$, its outcome cannot end
in $s_1$, since $A_j$ would have an interest in changing her strategy in $s_0$ (all plays of $\mathcal{G}$ are then
better for her). The strategies followed from $s$ thus define a Nash equilibrium in $\mathcal{G}$. □

If we consider a class of games such that $E(\mathcal{G}, A_i, A_j, \rho)$ belongs to that class when $\mathcal{G}$
does, then the NE existence problem is at least as hard as the constrained NE existence
problem. Note however that the reduction assumes lower bounds on the payoffs, and we do 
not have a similar result for upper bounds on the payoffs. For instance, as we will see
later for a conjunction of Büchi objectives, we do not know whether the NE existence
problem is in P (as the value problem) or NP-hard (as is the existence of an equilibrium
where all the players are losing).


4. The suspect game 

In this section, we construct an abstraction of a multi-player game $\mathcal{G}$ as a two-player
zero-sum game $\mathcal{H}$, such that there is a correspondence between Nash equilibria in $\mathcal{G}$ and winning
strategies in $\mathcal{H}$ (formalised in forthcoming Theorem 4.5). This transformation does not
require the game to be finite and is conceptually much deeper than the reductions given in
the previous section; it will allow us to use algorithmic techniques from zero-sum games to
compute Nash equilibria and hence solve the value and (constrained) NE existence problems
in various settings.

4.1. Construction of the suspect game. We fix a concurrent game
$\mathcal{G} = (\mathsf{States}, \mathsf{Agt}, \mathsf{Act}, \mathsf{Mov}, \mathsf{Tab}, (\lesssim_A)_{A \in \mathsf{Agt}})$ for the rest of the section, and begin with
introducing a few extra definitions.

Definition 4.1. A strategy profile $\sigma_{\mathsf{Agt}}$ is a trigger profile for a play $\pi$ from some state $s$ if,
for every player $A \in \mathsf{Agt}$, for every strategy $\sigma'_A$ of player $A$, the path $\pi$ is at least as good
as the outcome of $\sigma_{\mathsf{Agt}}[A \mapsto \sigma'_A]$ from $s$ (that is, $\mathrm{Out}(s, \sigma_{\mathsf{Agt}}[A \mapsto \sigma'_A]) \lesssim_A \pi$).

The following result is folklore and a direct consequence of the definition: 

Lemma 4.2. A Nash equilibrium is a trigger profile for its outcome. Reciprocally, a strategy
profile which is a trigger profile for its outcome is a Nash equilibrium.
Definition 4.3 ([!]). Given two states $s$ and $s'$, and a move $m_{\mathsf{Agt}}$, the set of suspect players
for $(s, s')$ and $m_{\mathsf{Agt}}$ is the set

$\mathrm{Susp}((s, s'), m_{\mathsf{Agt}}) = \{A \in \mathsf{Agt} \mid \exists m' \in \mathsf{Mov}(s, A).\ \mathsf{Tab}(s, m_{\mathsf{Agt}}[A \mapsto m']) = s'\}$.

Given a path $\rho$ and a strategy profile $\sigma_{\mathsf{Agt}}$, the set of suspect players for $\rho$ and $\sigma_{\mathsf{Agt}}$ is the
set of players that are suspect along each transition of $\rho$, i.e., it is the set

$\mathrm{Susp}(\rho, \sigma_{\mathsf{Agt}}) = \{A \in \mathsf{Agt} \mid \forall i < |\rho|.\ A \in \mathrm{Susp}((\rho_{=i}, \rho_{=i+1}), \sigma_{\mathsf{Agt}}(\rho_{\le i}))\}$.


Intuitively, player $A \in \mathsf{Agt}$ is a suspect for transition $(s, s')$ and move $m_{\mathsf{Agt}}$ if she can
unilaterally change her action to activate the transition $(s, s')$: if $s' \neq \mathsf{Tab}(s, m_{\mathsf{Agt}})$, then
this may be due to a deviation from $m_{\mathsf{Agt}}$ of any of the players in the set $\mathrm{Susp}((s, s'), m_{\mathsf{Agt}})$,
and no one else. If $s' = \mathsf{Tab}(s, m_{\mathsf{Agt}})$, it may simply be the case that no one has deviated, so
everyone is a potential suspect for the next moves. Similarly, we easily infer that player $A$ is
in $\mathrm{Susp}(\rho, \sigma_{\mathsf{Agt}})$ if, and only if, there is a strategy $\sigma'_A$ such that $\mathrm{Out}(s, \sigma_{\mathsf{Agt}}[A \mapsto \sigma'_A]) = \rho$.

Note that the notion of suspect players requires moves and arenas to be deterministic, 
and therefore everything which follows assumes the restriction to pure strategy profiles and 
to deterministic game structures. 
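
The set of suspect players for a single transition can be computed by trying every unilateral change of action. The following Python sketch is an illustration under assumptions of our own: a move is a dictionary from player names to actions, `mov` maps a (state, player) pair to the allowed actions, and `tab` is a function implementing the deterministic transition table. The toy game at the end is hypothetical.

```python
from typing import Callable, Dict, Hashable, Set, Tuple

State = Hashable
Move = Dict[str, str]  # player name -> action


def suspects(
    s: State,
    s_next: State,
    move: Move,
    mov: Dict[Tuple[State, str], Set[str]],
    tab: Callable[[State, Move], State],
) -> Set[str]:
    """Players who can reach s_next from s by changing only their own action in `move`."""
    result: Set[str] = set()
    for player in move:
        for alternative in mov[(s, player)]:
            deviated = dict(move)
            deviated[player] = alternative  # the original action is allowed too
            if tab(s, deviated) == s_next:
                result.add(player)
                break
    return result


# Hypothetical toy game: from s0, only player A's action matters.
def toy_tab(s: State, move: Move) -> State:
    return "s1" if move["A"] == "a" else "s2"


TOY_MOV = {("s0", "A"): {"a", "b"}, ("s0", "B"): {"a", "b"}}
```

With the move where both players choose `a`, the expected successor `s1` leaves everybody suspect, whereas the successor `s2` can only be explained by a deviation of player `A`.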


We fix a play $\pi$ in $\mathcal{G}$. From game $\mathcal{G}$ and play $\pi$, we build the suspect game $\mathcal{H}(\mathcal{G}, \pi)$, which
is a two-player turn-based game defined as follows. The players in $\mathcal{H}(\mathcal{G}, \pi)$ are named Eve
and Adam. Since $\mathcal{H}(\mathcal{G}, \pi)$ is turn-based, its state space can be written as the disjoint union of
the set $V_\exists$ controlled by Eve, which is (a subset of) $\mathsf{States} \times 2^{\mathsf{Agt}}$, and the set $V_\forall$ controlled
by Adam, which is (a subset of) $\mathsf{States} \times 2^{\mathsf{Agt}} \times \mathsf{Act}^{\mathsf{Agt}}$. The game is played in the following
way: from a configuration $(s, P)$ in $V_\exists$, Eve chooses a legal move $m_{\mathsf{Agt}}$ from $s$; the next
state is $(s, P, m_{\mathsf{Agt}})$; then Adam chooses some state $s'$ in $\mathsf{States}$, and the new configuration
is $(s', P \cap \mathrm{Susp}((s, s'), m_{\mathsf{Agt}}))$. In particular, when the state $s'$ chosen by Adam is such
that $s' = \mathsf{Tab}(s, m_{\mathsf{Agt}})$ (we say that Adam obeys Eve when this is the case), then the new
configuration is $(s', P)$.

We define projections $\mathrm{proj}_1$ and $\mathrm{proj}_2$ from $V_\exists$ on $\mathsf{States}$ and $2^{\mathsf{Agt}}$, resp., by
$\mathrm{proj}_1(s, P) = s$ and $\mathrm{proj}_2(s, P) = P$. We extend these projections to paths in a natural way
(but only using Eve's states in order to avoid stuttering), letting
$\mathrm{proj}_1((s_0, P_0) \cdot (s_0, P_0, m_0) \cdot (s_1, P_1) \cdots) = s_0 \cdot s_1 \cdots$. For any play $\rho$, $\mathrm{proj}_2(\rho)$ (seen as a
sequence of sets of players of $\mathcal{G}$) is non-increasing, therefore its limit $\lambda(\rho)$ is well defined.
We notice that if $\lambda(\rho) \neq \emptyset$, then $\mathrm{proj}_1(\rho)$ is a play in $\mathcal{G}$. An outcome $\rho$ is winning for Eve
if, for all $A \in \lambda(\rho)$, it holds $\mathrm{proj}_1(\rho) \lesssim_A \pi$.
The winning region $W(\mathcal{G}, \pi)$ (later simply denoted by $W$ when $\mathcal{G}$ and $\pi$ are clear from
the context) is the set of configurations of $\mathcal{H}(\mathcal{G}, \pi)$ from which Eve has a winning strategy.
Intuitively Eve tries to have the players play a Nash equilibrium, and Adam tries to disprove
that it is a Nash equilibrium, by finding a possible deviation that improves the payoff of
one of the players.


4.2. Correctness of the suspect-game construction. The next lemma establishes a
correspondence between winning strategies in $\mathcal{H}(\mathcal{G},\pi)$ and trigger profiles (and
therefore Nash equilibria) in $\mathcal{G}$.

Lemma 4.4. Let $s$ be a state of $\mathcal{G}$ and $\pi$ be a play from $s$ in $\mathcal{G}$. The following
two conditions are equivalent:

• Eve has a winning strategy in $\mathcal{H}(\mathcal{G},\pi)$ from $(s,\mathsf{Agt})$, and its outcome $\rho'$
from $(s,\mathsf{Agt})$ when Adam obeys Eve is such that $\mathsf{proj}_1(\rho') = \rho$;

• there is a trigger profile for $\pi$ in $\mathcal{G}$ from state $s$ whose outcome from $s$ is $\rho$.

Proof. Assume there is a winning strategy $\sigma_\exists$ for Eve in $\mathcal{H}(\mathcal{G},\pi)$ from
$(s,\mathsf{Agt})$, whose outcome from $(s,\mathsf{Agt})$ when Adam obeys Eve is $\rho'$ with
$\mathsf{proj}_1(\rho') = \rho$. We define the strategy profile $\sigma_{\mathsf{Agt}}$ according to the
actions played by Eve. Pick a history $g = s_1 s_2 \cdots s_{k+1}$, with $s_1 = s$. Let $h$ be the outcome
of $\sigma_\exists$ from $s$ ending in a state of $V_\exists$ and such that
$\mathsf{proj}_1(h) = s_1 \cdots s_k$. This history is uniquely defined as follows: the first state of $h$
is $(s_1,\mathsf{Agt})$, and if its $(2i+1)$-st state is $(s_i,P_i)$, then its $(2i+2)$-nd state is
$(s_i,P_i,\sigma_\exists(h_{\leq 2i+1}))$ and its $(2i+3)$-rd state is
$(s_{i+1}, P_i \cap \mathsf{Susp}((s_i,s_{i+1}), \sigma_\exists(h_{\leq 2i+1})))$. Now, write $(s_k,P_k)$ for
the last state of $h$, and let
$h' = h \cdot (s_k,P_k,\sigma_\exists(h)) \cdot (s_{k+1}, P_k \cap \mathsf{Susp}((s_k,s_{k+1}),\sigma_\exists(h)))$.
Then we define $\sigma_{\mathsf{Agt}}(g) = \sigma_\exists(h')$. Notice that when $g \cdot s$ is a prefix of
$\mathsf{proj}_1(\rho')$, then $g \cdot s \cdot \sigma_{\mathsf{Agt}}(g \cdot s)$ is also a prefix of
$\mathsf{proj}_1(\rho')$. In particular, $\mathsf{Out}(s,\sigma_{\mathsf{Agt}}) = \mathsf{proj}_1(\rho') = \rho$.

We now prove that $\sigma_{\mathsf{Agt}}$ is a trigger profile for $\pi$. Pick a player $A \in \mathsf{Agt}$,
a strategy $\sigma'_A$ for player $A$, and let $g = \mathsf{Out}(s,\sigma_{\mathsf{Agt}}[A \mapsto \sigma'_A])$.
With a play $g$, we associate a play $h$ in $\mathcal{H}(\mathcal{G},\pi)$ in the same way as above. Then
player $A$ is a suspect along all the transitions of $g$, so that she belongs to $\lambda(h)$. Now, as
$\sigma_\exists$ is winning, $\mathsf{proj}_1(h) \precsim_A \pi$, which proves that $\sigma_{\mathsf{Agt}}$ is a
trigger profile.

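Conversely, assume that $\sigma_{\mathsf{Agt}}$ is a trigger profile for $\pi$ whose outcome is $\rho$, and
define the strategy $\sigma_\exists$ by $\sigma_\exists(h) = \sigma_{\mathsf{Agt}}(\mathsf{proj}_1(h))$. Notice
that the outcome $\rho'$ of $\sigma_\exists$ when Adam obeys Eve satisfies $\mathsf{proj}_1(\rho') = \rho$.

Let $\rho$ be an outcome of $\sigma_\exists$ from $s$, and $A \in \lambda(\rho)$. Then $A$ is a suspect for
each transition along $\mathsf{proj}_1(\rho)$, which means that for all $i$, there is a move $m_i^A$ such that

$$\mathsf{proj}_1(\rho_{=i+1}) = \mathsf{Tab}\bigl(\mathsf{proj}_1(\rho_{=i}),\ \sigma_{\mathsf{Agt}}(\mathsf{proj}_1(\rho_{\leq i}))[A \mapsto m_i^A]\bigr).$$

Therefore there is a strategy $\sigma'_A$ such that
$\mathsf{proj}_1(\rho) = \mathsf{Out}(s,\sigma_{\mathsf{Agt}}[A \mapsto \sigma'_A])$. Since $\sigma_{\mathsf{Agt}}$
is a trigger profile for $\pi$, it holds that $\mathsf{proj}_1(\rho) \precsim_A \pi$. As this holds for any
$A \in \lambda(\rho)$, $\sigma_\exists$ is winning. □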

We now state the correctness theorem for the suspect game construction.

Theorem 4.5. Let $\mathcal{G} = (\mathsf{States}, \mathsf{Agt}, \mathsf{Act}, \mathsf{Mov}, \mathsf{Tab}, (\precsim_A)_{A \in \mathsf{Agt}})$
be a concurrent game, $s$ be a state of $\mathcal{G}$, and $\pi$ be a play in $\mathcal{G}$. The following two
conditions are equivalent:

• there is a Nash equilibrium $\sigma_{\mathsf{Agt}}$ from $s$ in $\mathcal{G}$ whose outcome is $\pi$;

• there is a play $\rho$ from $(s,\mathsf{Agt})$ in $\mathcal{H}(\mathcal{G},\pi)$,

(1) such that $\mathsf{proj}_1(\rho) = \pi$;

(2) along which Adam always obeys Eve; and

(3) such that for all indices $i$, there is a strategy $\sigma_\exists^i$ for Eve, for which any play in
$\rho_{\leq i} \cdot \mathsf{Out}(\rho_{=i}, \sigma_\exists^i)$ is winning for Eve.

Proof. The Nash equilibrium is a trigger profile, and from Lemma 4.4 we get a winning
strategy $\sigma_\exists$ in $\mathcal{H}(\mathcal{G},\pi)$. The outcome $\rho$ of $\sigma_\exists$ from $s$ when
Adam obeys Eve is such that $\pi = \mathsf{proj}_1(\rho)$ is the outcome of the Nash equilibrium. Now for
every prefix $\rho_{\leq i}$, the strategy $\sigma_\exists^i \colon h \mapsto \sigma_\exists(\rho_{\leq i} \cdot h)$
is such that any play in $\rho_{\leq i} \cdot \mathsf{Out}(\rho_{=i}, \sigma_\exists^i)$ is winning for Eve.

Conversely, let $\rho'$ be a path in $\mathcal{H}(\mathcal{G},\pi)$ and assume it satisfies all three
conditions. We define a strategy $\lambda_\exists$ that follows $\rho'$ when Adam obeys. Along $\rho'$, this
strategy is defined as follows: $\lambda_\exists(\rho'_{\leq 2i}) = m_{\mathsf{Agt}}$ such that
$\mathsf{Tab}(\mathsf{proj}_1(\rho'_{=2i}), m_{\mathsf{Agt}}) = \mathsf{proj}_1(\rho'_{=2i+2})$. Such a legal
move must exist since Adam obeys Eve along $\rho'$ by condition 2. Now, if Adam deviates from
the obeying strategy (at step $i$), we make $\lambda_\exists$ follow the strategy $\sigma_\exists^i$ (given by
condition 3), which will ensure that the outcome is winning for Eve.

The outcomes of $\lambda_\exists$ are then either the path $\rho'$, or a path $\rho''$ obtained by following a
winning strategy after a prefix of $\rho'$. The path $\rho''$ is losing for Adam, hence for all
$A \in \lambda(\rho'')$, $\mathsf{proj}_1(\rho'') \precsim_A \mathsf{proj}_1(\rho') = \pi$. This proves that
$\lambda_\exists$ is a winning strategy. Applying Lemma 4.4, we obtain a strategy profile
$\sigma_{\mathsf{Agt}}$ in $\mathcal{G}$ that is a trigger profile for $\pi$. Moreover, the outcome of
$\sigma_{\mathsf{Agt}}$ from $s$ is $\mathsf{proj}_1(\rho')$ (using condition 1), so that $\sigma_{\mathsf{Agt}}$
is a Nash equilibrium. □

Figure 10. A small part of the suspect game for the game of Figure 3

Remark 4.6. Assume the preference relations of each player $A$ in $\mathcal{G}$ are prefix-independent,
i.e., for all plays $\rho$ and $\rho'$, $\rho \precsim_A \rho'$ iff for all indices $i$ and $j$,
$\rho_{\geq i} \precsim_A \rho'_{\geq j}$. Then the winning condition of Eve is also prefix-independent, and
condition 3 just states that $\rho'$ has to stay within the winning region of Eve. Note that, for
prefix-dependent preference relations, condition 3 does not reduce to staying within the winning region
of Eve: for instance, for safety objectives, if the losing states of all the players have been visited
then any prolongation will satisfy the condition, even though it might leave the winning region of Eve.

Example 4.7. We depict on Figure 10 part of the suspect game for the game of Figure 3.
Note that the structure of $\mathcal{H}(\mathcal{G},\pi)$ does not depend on $\pi$. Only the winning
condition is affected by the choice of $\pi$.

In the rest of the paper, we use the suspect-game construction to algorithmically solve
the NE existence problem and the constrained NE existence problem in finite games for
large classes of preference relations. Before that, we carefully analyse the size of the suspect
game when the original game is finite.

4.3. Size of the suspect games when the original game is finite. We suppose that
$\mathcal{G}$ is finite. At first sight, the number of states in $\mathcal{H}(\mathcal{G},\pi)$ is exponential
(in the number of players of $\mathcal{G}$). However, there are two cases for which we easily see that the
number of states of $\mathcal{H}(\mathcal{G},\pi)$ is actually only polynomial:

• if there is a state in which all the players have several possible moves, then the transition
table (which is part of the input, as discussed in Remark 2.2) is also exponential in the
number of players;

• if the game is turn-based, then the transition table is "small", but there is always at most
one suspect player (unless all of them are suspects), so that the number of reachable
states in $\mathcal{H}(\mathcal{G},\pi)$ is also small.

We now prove that, due to the explicit encoding of the set of transitions (recall Remark 2.2),
this can be generalised:

Proposition 4.8. Let $\mathcal{G} = (\mathsf{States}, \mathsf{Agt}, \mathsf{Act}, \mathsf{Mov}, \mathsf{Tab}, (\precsim_A)_{A \in \mathsf{Agt}})$
be a finite concurrent game and $\pi$ be a play in $\mathcal{G}$. The number of reachable configurations from
$\mathsf{States} \times \{\mathsf{Agt}\}$ in $\mathcal{H}(\mathcal{G},\pi)$ is polynomial in the size of $\mathcal{G}$.

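Proof. The game $\mathcal{H}(\mathcal{G},\pi)$ contains the states $(s,\mathsf{Agt})$ and the states
$(s,\mathsf{Agt},m_{\mathsf{Agt}})$, where $m_{\mathsf{Agt}}$ is a legal move from $s$; the number of these
states is bounded by $|\mathsf{States}| + |\mathsf{Tab}|$. The successors of those states that are not of
the same form are the $(t, \mathsf{Susp}((s,t), m_{\mathsf{Agt}}))$ with $t \neq \mathsf{Tab}(s, m_{\mathsf{Agt}})$.
If some player $A \in \mathsf{Agt}$ is a suspect for transition $(s,t)$, then besides $m_A$, she must have at
least a second action $m'$, for which $\mathsf{Tab}(s, m_{\mathsf{Agt}}[A \mapsto m']) = t$. Thus the transition
table from state $s$ has size at least $2^{|\mathsf{Susp}((s,t), m_{\mathsf{Agt}})|}$. The successors of
$(t, \mathsf{Susp}((s,t), m_{\mathsf{Agt}}))$ are of the form $(t',P)$ or $(t',P,m_{\mathsf{Agt}})$ where $P$ is a
subset of $\mathsf{Susp}((s,t), m_{\mathsf{Agt}})$; there can be no more than
$(|\mathsf{States}| + |\mathsf{Tab}|) \cdot 2^{|\mathsf{Susp}((s,t), m_{\mathsf{Agt}})|}$ of them, which is bounded
by $(|\mathsf{States}| + |\mathsf{Tab}|) \cdot |\mathsf{Tab}|$. The total number of reachable states is then
bounded by $(|\mathsf{States}| + |\mathsf{Tab}|) \cdot (1 + (|\mathsf{States}| + |\mathsf{Tab}|) \cdot |\mathsf{Tab}|)$. □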
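
The turn-based case mentioned before Proposition 4.8 can be checked on a toy instance by enumerating the reachable Eve-configurations. The sketch below is only illustrative: the function `reachable_eve_configs`, the encoding of the arena, and the three-player example (in which a single player has a choice in each state) are assumptions made for this example.

```python
from collections import deque
from itertools import product


def reachable_eve_configs(agents, states, moves, tab, init):
    """Breadth-first enumeration of the Eve-configurations (s, P)
    reachable from (init, Agt) in the suspect game."""

    def susp(s, t, m):
        # Agents able to produce transition s -> t by a unilateral change of m.
        return frozenset(
            a for i, a in enumerate(agents)
            if any(tab[(s, m[:i] + (alt,) + m[i + 1:])] == t
                   for alt in moves[(s, a)])
        )

    start = (init, frozenset(agents))
    seen = {start}
    queue = deque([start])
    while queue:
        s, P = queue.popleft()
        for m in product(*(moves[(s, a)] for a in agents)):  # Eve's choice
            for t in states:                                 # Adam's choice
                nxt = (t, P & susp(s, t, m))
                if nxt not in seen:
                    seen.add(nxt)
                    queue.append(nxt)
    return seen


# Hypothetical turn-based arena: A decides in s0, B decides in s1, C never decides.
AGENTS = ("A", "B", "C")
STATES = ("s0", "s1")
MOVES = {("s0", "A"): (0, 1), ("s0", "B"): (0,), ("s0", "C"): (0,),
         ("s1", "A"): (0,), ("s1", "B"): (0, 1), ("s1", "C"): (0,)}
TAB = {("s0", (0, 0, 0)): "s0", ("s0", (1, 0, 0)): "s1",
       ("s1", (0, 0, 0)): "s1", ("s1", (0, 1, 0)): "s0"}

configs = reachable_eve_configs(AGENTS, STATES, MOVES, TAB, "s0")
```

On this arena only 8 of the 16 conceivable pairs $(s,P)$ are reachable, and every reachable suspect set is either the full set of players or contains at most one player.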

5. Single-objective preference relations 

In this section we will be interested in finite games with single-objective preference relations. 

The value problem for finite concurrent games with $\omega$-regular objectives has standard
solutions in game theory; they are given in Table 1 (page 3). Let us briefly give some
explanations. Most of the basic literature on two-player games focuses on turn-based games,
and in particular algorithms for solving two-player games with $\omega$-regular objectives only
deal with turn-based games (see for instance [23, Chapter 2]). In particular, McNaughton
developed an algorithm to solve turn-based parity games in time $O(|\mathsf{States}| \cdot |\mathsf{Edg}|^{p-1})$,
where $p - 1$ is the number of priorities [32]. Büchi games and co-Büchi games correspond
to parity games with two priorities, hence they are solvable in polynomial time. Similarly,
reachability games and safety games can be transformed into Büchi games by making the
target states absorbing. Hence turn-based games with these types of objectives can be solved
in polynomial time.

Note however that we can reuse these algorithms in the concurrent case as follows.
Any finite concurrent zero-sum game with objective $\Omega$ for player $A_1$ can be transformed
into a turn-based zero-sum game with objective $\widetilde{\Omega}$ for player $A_1$: the idea is to replace
any edge labelled with a pair of actions $(a_1, a_2)$ by two consecutive transitions labelled with
$a_1$ (belonging to player $A_1$) and with $a_2$ (belonging to player $A_2$). Furthermore, if $\Omega$ is an
$\omega$-regular condition, then so is $\widetilde{\Omega}$, and the type of the objective (reachability, Büchi,
etc.) is preserved (note however that this transformation only preserves player $A_1$'s objective).
Hence the standard algorithm on the resulting turn-based game can be applied. Lower
bounds for reachability/safety and Büchi/co-Büchi games are also folklore results, and can
be obtained by encoding the circuit-value problem (we recall the encoding in Section 5.3.3).
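
The edge-splitting transformation can be sketched as follows. The function name `to_turn_based`, the encoding of the transition table as a dictionary keyed by `(state, a1, a2)`, and the small arena are assumptions made for this illustration only.

```python
def to_turn_based(states, moves1, moves2, tab):
    """Split every concurrent transition (a1, a2) into two turn-based steps:
    player A1 picks a1 in state s, then player A2 picks a2 in (s, a1)."""
    edges, owner = {}, {}
    for s in states:
        owner[s] = "A1"
        edges[s] = [(s, a1) for a1 in moves1[s]]
        for a1 in moves1[s]:
            owner[(s, a1)] = "A2"
            edges[(s, a1)] = sorted({tab[(s, a1, a2)] for a2 in moves2[s]})
    return edges, owner


# Hypothetical arena: in "s" the players win/lose depending on matching actions.
STATES = ("s", "win", "lose")
MOVES1 = {"s": (0, 1), "win": (0,), "lose": (0,)}
MOVES2 = {"s": (0, 1), "win": (0,), "lose": (0,)}
TAB = {("s", 0, 0): "win", ("s", 0, 1): "lose",
       ("s", 1, 0): "lose", ("s", 1, 1): "win",
       ("win", 0, 0): "win", ("lose", 0, 0): "lose"}

edges, owner = to_turn_based(STATES, MOVES1, MOVES2, TAB)
```

In the resulting game, player $A_2$ moves after seeing $a_1$, which matches what she can achieve against a fixed pure strategy of $A_1$; this is why only the objective of $A_1$ is preserved.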

We now focus on the NE existence problem and on the constrained NE existence problem
when each player has a single ($\omega$-regular) objective, using the suspect game construction.
The results are summarised in the second column of Table 1.

Streett and Muller objectives are not explicitly mentioned in the rest of the section. The
complexity of their respective (constrained) NE existence problems, which is given in Table 1,
can easily be inferred from other ones. The $\mathsf{P}^{\mathsf{NP}}_{\parallel}$-hardness for the NE existence
problem with Streett objectives follows from the corresponding hardness for parity objectives (parity
objectives can be encoded efficiently as Streett objectives). Hardness for the NE existence
problem in Muller games is deduced from hardness of the value problem (which holds for
turn-based games), applying Proposition 3.4. For both objectives, membership in PSPACE
follows from PSPACE membership for objectives given as Boolean circuits, since they can
efficiently be encoded as Boolean circuits.

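We fix for the rest of the section a multi-player finite game
$\mathcal{G} = (\mathsf{States}, \mathsf{Agt}, \mathsf{Act}, \mathsf{Mov}, \mathsf{Tab}, (\precsim_A)_{A \in \mathsf{Agt}})$,
and we assume that each $\precsim_A$ is single-objective, given by set $\Omega_A$.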

Remark 5.1. Let us come back to Remark 2.2 on our choice of an explicit encoding for
the set of transitions. Assuming more compact encodings, the complexity of computing
Nash equilibria for qualitative objectives does not allow one to distinguish between the intrinsic
complexity of the objectives. Indeed, in the formalism of m, the transition function is 
given in each state by a finite sequence $((\varphi_0, s_0), \ldots, (\varphi_h, s_h))$, where
$s_i \in \mathsf{States}$, and $\varphi_i$ is a boolean combination of propositions $(A = m)$ that evaluates
to true iff agent $A$ chooses action $m$. The transition table is then defined as follows:
$\mathsf{Tab}(s, m_{\mathsf{Agt}}) = s_j$ iff $j$ is the smallest index such that $\varphi_j$ evaluates to true
when, for every player $A \in \mathsf{Agt}$, $A$ chooses action $m_A$. It is required that the last boolean
formula $\varphi_h$ be $\top$, so that no agent can enforce a deadlock.
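
To make the compact encoding concrete, here is a small sketch of how such a sequence of guarded targets could be evaluated. The tuple-based formula syntax (`"eq"`, `"and"`, `"or"`, `"not"`, `"true"`) and the names `holds` and `tab_compact` are assumptions chosen for this illustration; they are not taken from the formalism referred to above.

```python
def holds(formula, move):
    """Evaluate a boolean combination of atoms (A = m) under a move,
    where `move` maps each agent to the action she chooses."""
    op = formula[0]
    if op == "true":
        return True
    if op == "eq":                      # ("eq", agent, action)
        return move[formula[1]] == formula[2]
    if op == "not":
        return not holds(formula[1], move)
    if op == "and":
        return all(holds(f, move) for f in formula[1:])
    if op == "or":
        return any(holds(f, move) for f in formula[1:])
    raise ValueError(f"unknown operator: {op}")


def tab_compact(rules, move):
    """Return the target of the first rule whose formula is satisfied."""
    for formula, target in rules:
        if holds(formula, move):
            return target
    raise ValueError("the last formula should be 'true'")


# Hypothetical rules for one state of a two-agent game.
RULES = [
    (("and", ("eq", "A", 1), ("eq", "B", 1)), "t1"),
    (("or", ("eq", "A", 1), ("eq", "B", 1)), "t2"),
    (("true",), "t0"),
]
```

With $n$ agents having two actions each, such a rule list can stay linear in $n$ while the explicit table has $2^n$ entries, which is the source of the complexity gap discussed in this remark.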

We can actually state the following result, whose proof is postponed to the Appendix
on page 69.

Proposition 5.2. For finite concurrent games with compact encoding of transition functions
and with reachability/Büchi/safety objectives, the constrained NE existence problem
is PSPACE-hard.

Remark 5.3. It is first interesting to notice that given two plays $\pi$ and $\pi'$ the suspect games
$\mathcal{H}(\mathcal{G},\pi)$ and $\mathcal{H}(\mathcal{G},\pi')$ only differ in their winning conditions. In
particular, the structure of the game only depends on $\mathcal{G}$, and has polynomial size (see
Proposition 4.8). We denote it with $\mathcal{J}(\mathcal{G})$. Moreover, as each relation $\precsim_A$ is given
by a single objective $\Omega_A$, the winning condition for Eve in $\mathcal{H}(\mathcal{G},\pi)$ rewrites as: for
every $A \in \lambda(\rho) \cap \mathsf{Los}(\pi)$, $\mathsf{proj}_1(\rho)$ is losing (in $\mathcal{G}$) for player $A$,
where $\mathsf{Los}(\pi)$ is the set of players losing along $\pi$ in $\mathcal{G}$. This winning condition
only depends on $\mathsf{Los}(\pi)$ (not on the precise value of play $\pi$). Therefore in this section, the
suspect game is denoted with $\mathcal{H}(\mathcal{G},L)$, where $L \subseteq \mathsf{Agt}$, and Eve wins play $\rho$
if, for every $A \in \lambda(\rho) \cap L$, $A$ loses along $\mathsf{proj}_1(\rho)$ in $\mathcal{G}$. In many cases we
will be able to simplify this winning condition, and to obtain simple algorithms for the corresponding
problems.

We now distinguish between the winning objectives of the players. There are some
similarities in some of the cases (for instance safety and co-Büchi objectives), but they
nevertheless all require specific techniques and proofs.

5.1. Reachability objectives. The value problem for a reachability winning condition is
P-complete. Below, we design a non-deterministic algorithm that runs in polynomial time
for solving the constrained NE existence problem. We then end this subsection with an
NP-hardness proof of the constrained NE existence problem and NE existence problem. In the
end, we prove the following result:

Theorem 5.4. For finite concurrent games with single reachability objectives, the NE existence
problem and the constrained NE existence problem are NP-complete.


5.1.1. Reduction to a safety game. We assume that for every player $A$, $\Omega_A$ is a single
reachability objective given by target set $T_A$. Given $L \subseteq \mathsf{Agt}$, in the suspect game
$\mathcal{H}(\mathcal{G},L)$, we show that the objective of Eve reduces to a safety objective. We define the safety
objective $\Omega_L$ in $\mathcal{H}(\mathcal{G},L)$ by the set
$T_L = \{(s,P) \mid \exists A \in P \cap L.\ s \in T_A\}$ of target states (which Eve has to avoid).

Lemma 5.5. Eve has a winning strategy in game $\mathcal{H}(\mathcal{G},L)$ iff Eve has a winning strategy
in game $\mathcal{J}(\mathcal{G})$ with safety objective $\Omega_L$.

Proof. We first show that any play in $\Omega_L$ is winning in $\mathcal{H}(\mathcal{G},L)$. Let
$\rho \in \Omega_L$, and let $A \in \lambda(\rho) \cap L$. Toward a contradiction, assume that
$\mathsf{Occ}(\mathsf{proj}_1(\rho)) \cap T_A \neq \emptyset$: there is a state $(s,P)$ along $\rho$ with
$s \in T_A$. Obviously $\lambda(\rho) \subseteq P$, which implies that $A \in P \cap L$. This
contradicts the fact that $\rho \in \Omega_L$. We have shown so far that any winning strategy for Eve
in $\mathcal{J}(\mathcal{G})$ with safety objective $\Omega_L$ is a winning strategy for Eve in
$\mathcal{H}(\mathcal{G},L)$.

Now assume that Eve has no winning strategy in game $\mathcal{J}(\mathcal{G})$ with safety objective
$\Omega_L$. Turn-based games with safety objectives being determined, Adam has a strategy $\sigma_\forall$
which ensures that no outcome of $\sigma_\forall$ is in $\Omega_L$. If $\rho \in \mathsf{Out}(\sigma_\forall)$,
there is a state $(s,P)$ along $\rho$ such that there is $A \in P \cap L$ with $s \in T_A$. We now modify the
strategy of Adam such that as soon as such a state is reached we switch from $\sigma_\forall$ to the
strategy that always obeys Eve. This ensures that in every outcome $\rho'$ of the new strategy, we reach a
state $(s,P)$ such that there is $A \in P \cap L$ with $s \in T_A$, and $\lambda(\rho') = P$. This strategy
of Adam thus makes Eve lose the game $\mathcal{H}(\mathcal{G},L)$, and Eve has no winning strategy in game
$\mathcal{H}(\mathcal{G},L)$. □
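
Once Eve's objective is a safety objective, her winning region is the complement of Adam's attractor of the set of states to avoid. The following sketch shows a naive fixpoint version of this computation; the function `eve_safety_region`, the adjacency-list encoding, and the toy game are assumptions for illustration, and every vertex is assumed to have at least one successor.

```python
def eve_safety_region(eve_vertices, edges, bad):
    """Vertices from which Eve can avoid `bad` forever in a turn-based game.
    `edges` maps each vertex to its successors; vertices outside
    `eve_vertices` belong to Adam. Naive quadratic fixpoint."""
    attractor = set(bad)          # Adam's attractor of the bad set
    changed = True
    while changed:
        changed = False
        for v, succs in edges.items():
            if v in attractor:
                continue
            if v in eve_vertices:
                pulled = all(u in attractor for u in succs)  # Eve cannot escape
            else:
                pulled = any(u in attractor for u in succs)  # Adam can force it
            if pulled:
                attractor.add(v)
                changed = True
    return set(edges) - attractor


# Hypothetical game: Eve owns e0, e1 and bad; Adam owns a0 and a1.
EDGES = {"e0": ["a0", "a1"], "a0": ["e0", "bad"],
         "a1": ["e1"], "e1": ["a1"], "bad": ["bad"]}
EVE = {"e0", "e1", "bad"}

region = eve_safety_region(EVE, EDGES, {"bad"})
```

Here Eve stays safe from `e0` by always moving to `a1`, whereas `a0` is losing for her because Adam can move to `bad`.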


5.1.2. Algorithm. The algorithm for solving the constrained NE existence problem in a
game where each player has a single reachability objective relies on Theorem 4.5 and
Proposition 3.1, and on the above analysis:

(i) guess a lasso-shaped play $\rho = \tau_1 \cdot \tau_2^\omega$ (with $|\tau_i| \leq 2|\mathsf{States}|^2$) in
$\mathcal{J}(\mathcal{G})$, such that Adam obeys Eve along $\rho$, and $\pi = \mathsf{proj}_1(\rho)$ satisfies the
constraint on the payoff;

(ii) compute the set $W(\mathcal{G}, \mathsf{Los}(\pi))$ of states that are winning for Eve in the suspect game
$\mathcal{H}(\mathcal{G}, \mathsf{Los}(\pi))$, where $\mathsf{Los}(\pi)$ is the set of losing players along $\pi$;

(iii) check that $\rho$ stays in $W(\mathcal{G}, \mathsf{Los}(\pi))$.

First notice that this algorithm is non-deterministic and runs in polynomial time: the
witness $\rho$ guessed in step (i) has polynomial size; the suspect game
$\mathcal{H}(\mathcal{G}, \mathsf{Los}(\pi))$ also has polynomial size (Proposition 4.8); step (ii) can be done in
polynomial time using a standard attractor computation [23, Sect. 2.5.1], as the game under analysis is
equivalent to a safety game (Lemma 5.5); finally, step (iii) can obviously be performed in polynomial time.

Step (i) ensures that conditions 1 and 2 of Theorem 4.5 hold for $\rho$, and step (iii) ensures
condition 3. Correctness of the algorithm then follows from Theorem 4.5 and Proposition 3.1.


5.1.3. Hardness. We prove NP-hardness of the constrained NE existence problem by encoding
an instance of 3SAT as follows. We assume a set of atomic propositions $AP = \{x_1, \ldots, x_p\}$,
and we let $\varphi = \bigwedge_{i=1}^{n} c_i$ where $c_i = \ell_{i,1} \vee \ell_{i,2} \vee \ell_{i,3}$ and
$\ell_{i,j} \in \{x_k, \neg x_k \mid 1 \leq k \leq p\}$. We build the turn-based game $\mathcal{G}$ with $n+1$
players $\mathsf{Agt} = \{A, C_1, \ldots, C_n\}$ as follows: for every $1 \leq k \leq p$, player $A$ chooses to
visit either location $x_k$ or location $\neg x_k$. Location $x_k$ is winning for player $C_i$ if, and only
if, $x_k$ is one of the literals in $c_i$, and similarly location $\neg x_k$ is winning for $C_i$ if, and
only if, $\neg x_k$ is one of the literals of $c_i$. The construction is illustrated on Figure 11, with the
reachability objectives defined as $\Omega_{C_i} = \{\ell_{i,1}, \ell_{i,2}, \ell_{i,3}\}$ for
$1 \leq i \leq n$. Now, it is easy to check that this game has a Nash equilibrium with payoff 1 for
all players $(C_i)_{1 \leq i \leq n}$ if, and only if, $\varphi$ is satisfiable.
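
The correspondence between plays of this game and truth assignments can be illustrated by a brute-force sketch. It assumes, as the construction suggests, that only player $A$ makes choices, so that a play is determined by the sequence of literal locations she visits; literals are encoded as signed integers in the DIMACS style, and the function names are hypothetical.

```python
from itertools import product


def winners_along(path, clauses):
    """For each clause player C_i, tell whether one of her target
    locations (the literals of c_i) is visited along `path`."""
    visited = set(path)
    return [any(lit in visited for lit in clause) for clause in clauses]


def all_winning_play(num_vars, clauses):
    """Search for a play of A (one literal location per variable)
    along which every clause player reaches her target; None if none exists."""
    for signs in product((1, -1), repeat=num_vars):
        path = [sign * k for k, sign in zip(range(1, num_vars + 1), signs)]
        if all(winners_along(path, clauses)):
            return path
    return None


# (x1 or x2 or x3) and (not x1 or not x2 or x3) and (not x3 or not x3 or x1)
SAT_CLAUSES = [(1, 2, 3), (-1, -2, 3), (-3, -3, 1)]
# x1 and not x1: unsatisfiable
UNSAT_CLAUSES = [(1, 1, 1), (-1, -1, -1)]

witness = all_winning_play(3, SAT_CLAUSES)
no_witness = all_winning_play(1, UNSAT_CLAUSES)
```

The exhaustive search is exponential in the number of variables; in the NP algorithm of Section 5.1.2 this play is guessed instead.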

We prove hardness for the NE existence problem by using the transformation described
in Section 3.4 once for each player. We define the game $\mathcal{G}_0$ similar to $\mathcal{G}$ but with an
extra player $C_{n+1}$ who does not control any state for now. For $1 \leq i \leq n$, we define
$\mathcal{G}_i = E(\mathcal{G}_{i-1}, C_i, C_{n+1}, \rho)$, where $\rho$ is a winning path for $C_i$. The preference
relation can be expressed in any $\mathcal{G}_i$ by a reachability condition, by giving to $C_{n+1}$ a target
which is the initial state of $\mathcal{G}$. According to Proposition 3.5, there is a Nash equilibrium in
$\mathcal{G}_i$ if, and only if, there is one in $\mathcal{G}_{i-1}$ where $C_i$ wins. Therefore there is a Nash
equilibrium in $\mathcal{G}_n$ if, and only if, $\varphi$ is satisfiable. This entails NP-hardness of the NE
existence problem.



Figure 11. Reachability game for the reduction of 3SAT 


5.2. Safety objectives. The value problem for safety objectives is P-complete. We next 
show that the constrained NE existence problem can be solved in NP, and conclude with 
NP-hardness of both the constrained NE existence problem and the NE existence problem. 
We hence prove: 

Theorem 5.6. For finite games with single safety objectives, the NE existence problem and
the constrained NE existence problem are NP-complete.


5.2.1. Reduction to a conjunction of reachability objectives. We assume $\Omega_A$ is a single safety
objective given by set $T_A$. In the corresponding suspect game, we show that the goal of
Eve is equivalent to a conjunction of reachability objectives. Let $L \subseteq \mathsf{Agt}$. In suspect game
$\mathcal{H}(\mathcal{G},L)$, we define several reachability objectives as follows: for each $A \in L$, we define
$T'_A = T_A \times \{P \mid P \subseteq \mathsf{Agt}\} \cup \mathsf{States} \times \{P \mid A \notin P\}$, and we write
$\Omega'_A$ for the corresponding reachability objectives.

Lemma 5.7. A play $\rho$ is winning for Eve in $\mathcal{H}(\mathcal{G},L)$ iff
$\rho \in \bigcap_{A \in L} \Omega'_A$.

Proof. Let $\rho$ be a play in $\mathcal{H}(\mathcal{G},L)$, and assume it is winning for Eve. Then, for each
$A \in \lambda(\rho) \cap L$, $\mathsf{proj}_1(\rho) \notin \Omega_A$, which means that the target set $T_A$ is
visited along $\mathsf{proj}_1(\rho)$, and therefore $T'_A$ is visited along $\rho$. If
$A \notin \lambda(\rho)$, then a state $(s,P)$ with $A \notin P$ is visited by $\rho$:
the target set $T'_A$ is visited. This implies that $\rho \in \bigcap_{A \in L} \Omega'_A$.

Conversely, let $\rho \in \bigcap_{A \in L} \Omega'_A$: for every $A \in L$, $T'_A$ is visited by $\rho$. Then,
either $T_A$ is visited by $\mathsf{proj}_1(\rho)$ (which means that $\mathsf{proj}_1(\rho) \notin \Omega_A$) or
$A \notin \lambda(\rho)$. In particular, $\rho$ is a winning play for Eve in $\mathcal{H}(\mathcal{G},L)$. □

5.2.2. Algorithm for solving finite zero-sum turn-based games with a conjunction of reachability
objectives. We now give a simple algorithm for solving zero-sum games with a conjunction
of reachability objectives. This algorithm works in exponential time with respect
to the size of the conjunction (we will see in Subsection 7.1.6 that the problem is
PSPACE-complete). However, for computing Nash equilibria in safety games we will only use it for
small (logarithmic size) conjunctions.

Let $\mathcal{G}$ be a two-player turn-based game with a winning objective for Eve given as a
conjunction of $k$ reachability objectives $\Omega_1, \ldots, \Omega_k$. We assume vertices of Eve and Adam in
$\mathcal{G}$ are $V_\exists$ and $V_\forall$ respectively, and that the initial vertex is $v_0$. The idea is to
construct a new game $\mathcal{G}'$ that remembers the objectives that have been visited so far. The vertices
of game $\mathcal{G}'$ controlled by Eve and Adam are $V'_\exists = V_\exists \times 2^{[1,k]}$ and
$V'_\forall = V_\forall \times 2^{[1,k]}$ respectively. There is a transition from $(v,S)$ to $(v',S')$ iff
there is a transition from $v$ to $v'$ in the original game and $S' = S \cup \{i \mid v' \in \Omega_i\}$. The
reachability objective $\Omega$ for Eve is given by target set $\mathsf{States} \times \{[1,k]\}$. It is clear
that there is a winning strategy in $\mathcal{G}$ from $v_0$ for the conjunction of reachability objectives
$\Omega_1, \ldots, \Omega_k$ iff there is a winning strategy in game $\mathcal{G}'$ from
$(v_0, \{i \mid v_0 \in \Omega_i\})$ for the reachability objective $\Omega$. The number of vertices of this
new game is $|V'_\exists \cup V'_\forall| = |V_\exists \cup V_\forall| \cdot 2^k$, and the size of the new
transition table $\mathsf{Tab}'$ is bounded by $|\mathsf{Tab}| \cdot 2^k$, where $\mathsf{Tab}$ is the transition
table of $\mathcal{G}$. Since an attractor computation on $\mathcal{G}'$ is done in time
$O(|V'_\exists \cup V'_\forall| \cdot |\mathsf{Tab}'|)$, we obtain an algorithm for solving zero-sum games with
a conjunction of reachability objectives, running in time
$O(2^{2k} \cdot (|V_\exists \cup V_\forall| \cdot |\mathsf{Tab}|))$.
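
The product construction and the attractor computation can be sketched as follows. The sketch only explores the part of $\mathcal{G}'$ reachable from the initial vertex and uses a naive fixpoint; the function name, the adjacency-list encoding, and the example game are assumptions made for illustration, and every vertex is assumed to have a successor.

```python
def solve_conjunctive_reachability(eve_vertices, edges, targets, v0):
    """Decide whether Eve can visit every target set in `targets` from v0.
    Product vertices are pairs (v, S), S being the indices already visited."""
    everything = frozenset(range(len(targets)))

    def seen_at(v):
        return frozenset(i for i, t in enumerate(targets) if v in t)

    # Build the reachable part of the product game.
    start = (v0, seen_at(v0))
    prod_edges, stack = {}, [start]
    while stack:
        node = stack.pop()
        if node in prod_edges:
            continue
        v, S = node
        prod_edges[node] = [(u, S | seen_at(u)) for u in edges[v]]
        stack.extend(prod_edges[node])

    # Eve's attractor of the vertices whose memory contains all indices.
    win = {n for n in prod_edges if n[1] == everything}
    changed = True
    while changed:
        changed = False
        for n, succs in prod_edges.items():
            if n in win:
                continue
            if n[0] in eve_vertices:
                ok = any(u in win for u in succs)
            else:
                ok = all(u in win for u in succs)
            if ok:
                win.add(n)
                changed = True
    return start in win


# Hypothetical game: Eve owns e0, t1, t2, dead; Adam owns a1, a2.
EDGES = {"e0": ["a1", "a2"], "a1": ["t1"], "a2": ["t1", "dead"],
         "t1": ["t2"], "t2": ["t2"], "dead": ["dead"]}
EVE = {"e0", "t1", "t2", "dead"}
TARGETS = [{"t1"}, {"t2"}]

from_e0 = solve_conjunctive_reachability(EVE, EDGES, TARGETS, "e0")
from_a2 = solve_conjunctive_reachability(EVE, EDGES, TARGETS, "a2")
```

From `e0` Eve wins by going through `a1`; from `a2` Adam can escape to `dead` before any target is seen.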

5.2.3. Algorithm. The algorithm for solving the constrained NE existence problem for single
reachability objectives could be copied and would then be correct. It would however not
yield an NP upper bound. We therefore propose a refined algorithm:

(i) guess a lasso-shaped play $\rho = \tau_1 \cdot \tau_2^\omega$ (with $|\tau_i| \leq |\mathsf{States}|^2$) in
$\mathcal{J}(\mathcal{G})$ such that Adam obeys Eve along $\rho$, and $\pi = \mathsf{proj}_1(\rho)$ satisfies the
constraint on the payoff. Note that if $\mathsf{Los}(\pi)$ is the set of players losing in $\pi$, computing
$W(\mathcal{G}, \mathsf{Los}(\pi))$ would require exponential time. We will avoid this expensive computation.

(ii) check that any Adam-deviation along $\rho$, say at position $i$ (for any $i$), leads to a state
from which Eve has a strategy $\sigma_\exists^i$ to ensure that any play in
$\rho_{\leq i} \cdot \mathsf{Out}(\sigma_\exists^i)$ is winning for her.

Step (ii) can be done as follows: pick an Adam-state $(s,\mathsf{Agt},m_{\mathsf{Agt}})$ along $\rho$ and a
successor $(t,P)$ such that $t \neq \mathsf{Tab}(s,m_{\mathsf{Agt}})$; we only need to show that
$(t,P) \in W(\mathcal{G}, (\mathsf{Los}(\pi) \setminus \mathsf{Los}(\rho_{\leq i})) \cap P)$. We can compute this
set efficiently (in polynomial time) using the algorithm of the previous paragraph, since
$2^{|P|} \leq |\mathsf{Tab}|$ (using the same argument as in Proposition 4.8).

This non-deterministic algorithm, which runs in polynomial time, precisely implements
Theorem 4.5, and therefore correctly decides the constrained NE existence problem.

Figure 12. Extending the game with final concurrent modules

5.2.4. Hardness. The NP-hardness for the constrained NE existence problem can be proven
by encoding an instance of 3SAT using a game similar to that for reachability objectives,
see Section 5.1.3. We only change the constraint, which is now that all players $C_i$ should be
losing, and we get the same equivalence.

The reduction of Lemma 3.4 cannot be used to deduce the hardness of the NE existence
problem, since it assumes a lower bound on the payoff. Here the constraint is an upper bound
("each player should be losing"). We therefore provide an ad-hoc reduction in this special
case, which is illustrated on Figure 12. We add some module at the end of the game to
enforce that in an equilibrium, all players are losing. We add concurrent states between A 
and each Ci (named AjCi). All players Ci are trying to avoid t, and A is trying to avoid u. 

Since $A$ has no target in the original game, she cannot lose before seeing $u$, and then she can always
change her strategy in the concurrent states in order to go to $t$. Therefore an equilibrium
always ends in $t$. A player $C_i$ whose target was not seen during the original game can change her
strategy in order to go to $u$ instead of $t$. That means that if there is an equilibrium, there
was one in the original game where all $C_i$ are losing. Conversely, if there was such an equilibrium in
the original game, we can extend this strategy profile by one whose outcome goes to $t$, and it is an
equilibrium in the new game. This concludes the NP-hardness of the NE existence problem.

5.3. Büchi objectives. The value problem for Büchi objectives is P-complete. In this
subsection we design a polynomial-time algorithm for solving the constrained NE existence
problem for Büchi objectives. The P-hardness of the NE existence problem can then be
inferred from the P-hardness of the value problem, applying Propositions 3.2 and 3.4. Globally,
we prove the following result:

Theorem 5.8. For finite games with single Büchi objectives, the NE existence problem and
the constrained NE existence problem are P-complete.


5.3.1. Reduction to a co-Büchi game. We assume that for every player $A$, $\Omega_A$ is a Büchi
objective given by target set $T_A$. Given $L \subseteq \mathsf{Agt}$, in the suspect game
$\mathcal{H}(\mathcal{G},L)$, we show that the objective of Eve is equivalent to a single co-Büchi objective. We
define the co-Büchi objective $\Omega_L$ in $\mathcal{H}(\mathcal{G},L)$ given by the target set
$T_L = \{(s,P) \mid \exists A \in P \cap L.\ s \in T_A\}$. Notice that the target set is defined in the same
way as for reachability objectives.

Lemma 5.9. A play $\rho$ is winning for Eve in $\mathcal{H}(\mathcal{G},L)$ iff $\rho \in \Omega_L$.

Proof. Assume that $\rho$ is winning for Eve in $\mathcal{H}(\mathcal{G},L)$. Then for every
$A \in \lambda(\rho) \cap L$, it holds that $\mathsf{Inf}(\mathsf{proj}_1(\rho)) \cap T_A = \emptyset$. Toward a
contradiction, assume that $\mathsf{Inf}(\rho) \cap T_L \neq \emptyset$. There exists
$(s,P)$ such that there is $A \in P \cap L$ with $s \in T_A$, which appears infinitely often along $\rho$.
In particular, $P = \lambda(\rho)$ (otherwise it would not appear infinitely often along $\rho$). Hence,
we have found $A \in \lambda(\rho) \cap L$ such that $\mathsf{Inf}(\mathsf{proj}_1(\rho)) \cap T_A \neq \emptyset$,
which is a contradiction. Therefore, $\rho \in \Omega_L$.

Assume $\rho \in \Omega_L$: for every $(s,P)$ such that there exists $A \in P \cap L$ with $s \in T_A$, $(s,P)$
appears finitely often along $\rho$. Let $A \in \lambda(\rho) \cap L$, and assume towards a contradiction that
there is $s \in T_A$ such that $s$ appears infinitely often along $\mathsf{proj}_1(\rho)$. This means that
$(s, \lambda(\rho))$ appears infinitely often along $\rho$, which contradicts the above condition. Therefore,
$\rho$ is winning for Eve in $\mathcal{H}(\mathcal{G},L)$. □

5.3.2. Algorithm. As for reachability objectives, the winning region for Eve in
$\mathcal{H}(\mathcal{G},L)$ can be computed in polynomial time (since this is the winning region of a
co-Büchi game, see Lemma 5.9 above). A non-deterministic algorithm running in polynomial time, similar to
the one for reachability objectives, can therefore be inferred. However, we can do better than
guessing an appropriate lasso-shaped play $\rho$ by looking at the strongly connected components
of the game: a strongly connected component of the game uniquely defines a payoff,
which is that of all plays that visit infinitely often all the states of that strongly connected
component. Using a clever partitioning of the set of strongly connected components of the
game, we obtain a polynomial-time algorithm.

From now on and until the end of Subsection 5.3.2, we relax the hypotheses on the
preference relations (that they are all single-objective with a Büchi condition). We present
an algorithm in a more general context, since the same techniques will be used in
Subsection 6.2.2 (and we chose to present the construction only once). For the rest of this
subsection we therefore make the following assumptions on the preference relations
$(\precsim_A)_{A \in \mathsf{Agt}}$. For every player $A \in \mathsf{Agt}$:

(a) $\precsim_A$ only depends on the set of states which is visited infinitely often: if $\rho$ and $\rho'$ are
two plays such that $\mathsf{Inf}(\rho) = \mathsf{Inf}(\rho')$ then $\rho \precsim_A \rho'$ and
$\rho' \precsim_A \rho$;

(b) $\precsim_A$ is given by an ordered objective $\omega_A$ with preorder $\lesssim_A$, and $\lesssim_A$ is
supposed to be monotonic;

(c) for every threshold $w^A$, we can compute in polynomial time $S^A \subseteq \mathsf{States}$ such that
$\mathsf{Inf}(\rho) \subseteq S^A \Leftrightarrow \rho \precsim_A w^A$.

Obviously, preferences given by single Büchi objectives do satisfy those hypotheses. At every
place where it is relevant, we will explain how the particular case of single Büchi objectives
is handled. Next we write $(\star)$ for the above assumptions, and $(\star)_a$ (resp. $(\star)_b$, $(\star)_c$)
for only the first (resp. second, third) assumption.

We first characterise the 'good' plays in $\mathcal{J}(\mathcal{G})$ in terms of the strongly connected
components they define: the strongly connected component defined by a play is the set of
states that are visited infinitely often by the play. We fix for each player $A$ equivalence
classes of plays $u^A$ and $w^A$, that represent lower and upper bounds for the constrained
NE existence problem. Both can be represented as finite sets, representing the set of states
which are visited infinitely often. For each $K \subseteq \mathsf{States}$, we write $v^A(K)$ for the
equivalence class of all paths $\pi$ that visit infinitely often exactly $K$, i.e., $\mathsf{Inf}(\pi) = K$. We
also write $v(K) = (v^A(K))_{A \in \mathsf{Agt}}$. We look for a transition system $(K,E)$, with
$K \subseteq \mathsf{States}$ and $E \subseteq K \times K$, for which the following properties hold:

(1) $u^A \precsim_A v^A(K) \precsim_A w^A$ for all $A \in \mathsf{Agt}$;

(2) $(K,E)$ is strongly connected;

(3) $\forall k \in K.\ (k,\mathsf{Agt}) \in W(\mathcal{G},v(K))$;

(4) $\forall (k,k') \in E.\ \exists (k,\mathsf{Agt},m_{\mathsf{Agt}}) \in W(\mathcal{G},v(K)).\ \mathsf{Tab}(k,m_{\mathsf{Agt}}) = k'$;

(5) $(K \times \{\mathsf{Agt}\})$ is reachable from $(s,\mathsf{Agt})$ in $W(\mathcal{G},v(K))$;

where $W(\mathcal{G},v(K))$ is the winning region of Eve in suspect game $\mathcal{H}(\mathcal{G},v(K))$.

If one can find one such transition system $(K,E)$, then we will be able to build a lasso-play
$\rho$ from $(s,\mathsf{Agt})$ in the suspect game that will satisfy the conditions of Theorem 4.5.
Formally, we have the following lemma:

Lemma 5.10. Under hypothesis $(\star)_a$, there is a transition system $(K,E)$ satisfying
conditions 1-5 if, and only if, there is a path $\rho$ from $(s,\mathsf{Agt})$ in
$\mathcal{H}(\mathcal{G},v(K))$ that never gets out of $W(\mathcal{G},v(K))$, along which Adam always obeys Eve,
$u^A \precsim_A v^A(K) \precsim_A w^A$ for all $A \in \mathsf{Agt}$, and
$\mathsf{proj}_1(\mathsf{Inf}(\rho) \cap V_\exists) = K$ (which implies that $\mathsf{proj}_1(\rho) \in v^A(K)$ for
all $A$).

Proof. The first implication is shown by building a path in $W(\mathcal{G},v(K))$ that successively
visits all the states in $K \times \{\mathsf{Agt}\}$ forever. Thanks to conditions 5, 2 and 4 (and the fact that
Adam obeys Eve), such a path exists, and from conditions 3 and 4 this path remains in the winning region.
From condition 1 we have the condition on the preferences. Conversely, consider such a path $\rho$, and let
$K = \mathsf{proj}_1(\mathsf{Inf}(\rho) \cap V_\exists)$ and
$E = \{(k,k') \in K^2 \mid \exists (k,\mathsf{Agt},m_{\mathsf{Agt}}) \in \mathsf{Inf}(\rho).\ \mathsf{Tab}(k,m_{\mathsf{Agt}}) = k'\}$.
Condition 5 clearly holds. Conditions 1, 3 and 4 are easy consequences of the hypotheses
and construction. We prove that $(K,E)$ is strongly connected. First, since Adam obeys Eve
and $\rho$ starts in $(s,\mathsf{Agt})$, we have $\lambda(\rho) = \mathsf{Agt}$. Now, take any two states $k$ and
$k'$ in $K$: then $\rho$ visits $(k,\mathsf{Agt})$ and $(k',\mathsf{Agt})$ infinitely often, and there is a
subpath of $\rho$ between those two states, all of whose states appear infinitely often along $\rho$. Such a
subpath gives rise to a path between $k$ and $k'$, as required. □

As a consequence, if (AT, E) satisfies the five previous conditions, by Theorem 14.51 there 
is a Nash equilibrium whose outcome lies between the bounds and w^. Our aim is to 
compute efficiently all maximal pairs {K, E) that satisfy the five conditions. 

To that aim we define a recursive function SSG (standing for “solve sub-game”), working 
on transition systems, that will decompose efficiently any transition system that does not 
satisfy the five conditions above into polynomially many disjoint sub-transition systems via 
a decomposition into strongly connected components. 

• if $K \times \{\mathsf{Agt}\} \subseteq W(\mathcal{G}, v(K))$, and if for all $(k, k') \in E$ there is a
$(k, \mathsf{Agt}, m_{\mathsf{Agt}})$ in $W(\mathcal{G}, v(K))$ s.t. $\mathsf{Tab}(k, m_{\mathsf{Agt}}) = k'$,
and finally if $(K, E)$ is strongly connected, then we set $\mathrm{SSG}((K, E)) = \{(K, E)\}$.
This means that conditions (2)-(4) are satisfied by $(K, E)$.

• otherwise, we let

$$\mathrm{SSG}((K, E)) = \bigcup_{(K', E') \in \mathrm{SCC}((K, E))} \mathrm{SSG}(T((K', E')))$$

where $\mathrm{SCC}((K, E))$ is the set of strongly connected components of $(K, E)$ (which can be
computed in linear time), and where $T((K', E'))$ is the transition system whose set of
states is $\{k \in K' \mid (k, \mathsf{Agt}) \in W(\mathcal{G}, v(K'))\}$ and whose set of edges is

$$\{(k, k') \in E' \mid \exists (k, \mathsf{Agt}, m_{\mathsf{Agt}}) \in W(\mathcal{G}, v(K')).\ \mathsf{Tab}(k, m_{\mathsf{Agt}}) = k'\}.$$

Notice that this set of edges is never empty, but $T((K', E'))$ might not be strongly
connected anymore, so that this is really a recursive definition.

Footnote. Formally the suspect game has been defined with a play as reference, and not an
equivalence class. However, in this subsection, if $\pi$ and $\pi'$ are equivalent, the games
$\mathcal{H}(\mathcal{G}, \pi)$ and $\mathcal{H}(\mathcal{G}, \pi')$ are identical.

The recursive function SSG decomposes any (sub-)transition system of the game into a list 
of disjoint transition systems which all satisfy conditions (2)-(4) above. 
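
The recursive decomposition can be sketched in Python as follows. This is only an illustrative
sketch under stated assumptions, not the implementation of the paper: the two oracles `win_states`
and `win_edges` are hypothetical callables standing for the computation of $W(\mathcal{G}, v(K))$
(given a set $K$, they return the states $k$ such that $(k, \mathsf{Agt})$ is winning, and the edges
that have a winning witness $(k, \mathsf{Agt}, m_{\mathsf{Agt}})$). Sub-systems without any edge are
simply discarded, which is a simplification made for the sketch.

```python
def sccs(states, edges):
    """Strongly connected components (Kosaraju), each with its internal edges."""
    succ = {k: [] for k in states}
    pred = {k: [] for k in states}
    for a, b in edges:
        succ[a].append(b)
        pred[b].append(a)
    order, seen = [], set()
    for root in states:                      # first pass: finishing order
        if root in seen:
            continue
        seen.add(root)
        stack = [(root, iter(succ[root]))]
        while stack:
            node, it = stack[-1]
            nxt = next((n for n in it if n not in seen), None)
            if nxt is None:
                order.append(node)
                stack.pop()
            else:
                seen.add(nxt)
                stack.append((nxt, iter(succ[nxt])))
    comps, done = [], set()
    for root in reversed(order):             # second pass: reversed graph
        if root in done:
            continue
        comp, todo = set(), [root]
        done.add(root)
        while todo:
            node = todo.pop()
            comp.add(node)
            for p in pred[node]:
                if p not in done:
                    done.add(p)
                    todo.append(p)
        comps.append(frozenset(comp))
    return [(c, frozenset(e for e in edges if e[0] in c and e[1] in c))
            for c in comps]


def ssg(system, win_states, win_edges):
    """Decompose (K, E) into sub-systems satisfying conditions (2)-(4)."""
    K, E = system
    if not K or not E:                       # assumption: edge-less systems are dropped
        return []
    comps = sccs(K, E)
    if len(comps) == 1 and K <= win_states(K) and E <= win_edges(K):
        return [system]                      # base case of the definition
    result = []
    for Kp, Ep in comps:                     # otherwise recurse on T((K', E'))
        keep = Kp & win_states(Kp)
        allowed = win_edges(Kp)
        sub_edges = frozenset((a, b) for (a, b) in Ep
                              if (a, b) in allowed and a in keep and b in keep)
        result.extend(ssg((keep, sub_edges), win_states, win_edges))
    return result


# Toy run: two cycles {a, b} and {c, d}; the hypothetical oracle rejects state d.
K = frozenset("abcd")
E = frozenset({("a", "b"), ("b", "a"), ("b", "c"), ("c", "d"), ("d", "c")})
all_pairs = lambda S: {(x, y) for x in S for y in S}
solutions = ssg((K, E), lambda S: S - {"d"}, all_pairs)
```

Each recursive call works on a strictly smaller system (either a proper component, or a component
from which a state or an edge has been removed), which is why the recursion terminates.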

So far the computation does not take into account the bounds for the payoffs of the
players (lower bound $u^A$ and upper bound $w^A$ for player $A$). For each upper bound $w^A$
we assume condition $(\star)_c$ holds, witnessed by a set $S^A$. In the particular case of a single
Büchi objective for each player defined by target $T_A$, this is simply done by setting
$S^A = \mathsf{States} \setminus T_A$, if this player has to be losing (that is, if $w^A$ does not
satisfy the Büchi objective). Now assuming we have found the appropriate sets $S^A$, we define

$$\mathrm{Sol} = \mathrm{SSG}\Big(\Big(\bigcap_{A \in \mathsf{Agt}} S^A, \mathsf{Edg}'\Big)\Big) \cap \{(K, E) \mid \forall A \in \mathsf{Agt}.\ u^A \lesssim_A v^A(K)\}$$

where $\mathsf{Edg}'$ restricts $\mathsf{Edg}$ to $\bigcap_{A \in \mathsf{Agt}} S^A$.

We now show that the set Sol computes (in a sense that we make clear) the transition
systems that are mentioned in Lemma 5.10.

Lemma 5.11. We suppose condition $(\star)$ holds. If $(K, E) \in \mathrm{Sol}$ then it satisfies
conditions (1) to (4). Conversely, if $(K, E)$ satisfies conditions (1) to (4), then there exists
$(K', E') \in \mathrm{Sol}$ such that $(K, E) \subseteq (K', E')$.

Proof. Let $(K, E) \in \mathrm{Sol}$. By definition of SSG, all $(k, \mathsf{Agt})$ for $k \in K$ are in
$W(\mathcal{G}, v(K))$, and for all $(k, k') \in E$, there is a state $(k, \mathsf{Agt}, m_{\mathsf{Agt}})$
in $W(\mathcal{G}, v(K))$ such that $\mathsf{Tab}(k, m_{\mathsf{Agt}}) = k'$, and $(K, E)$ is strongly
connected. Also, for all $A$, $u^A \lesssim_A v^A(K)$ because
$\mathrm{Sol} \subseteq \{(K, E) \mid \forall A \in \mathsf{Agt}.\ u^A \lesssim_A v^A(K)\}$.
Finally, for any $A \in \mathsf{Agt}$, $v^A(K) \lesssim_A w^A$ because the set $K$ is included in $S^A$.

Conversely, assume that $(K, E)$ satisfies the conditions. We show that if $(K, E) \subseteq (K', E')$
then there is $(K'', E'')$ in $\mathrm{SSG}((K', E'))$ such that $(K, E) \subseteq (K'', E'')$. The proof
is by induction on the size of $(K', E')$.

The base case is when $(K', E')$ satisfies the conditions (2), (3) and (4): in that case,
$\mathrm{SSG}((K', E')) = \{(K', E')\}$, and by letting $(K'', E'') = (K', E')$ we get the expected
result.

We now analyze the other case. There is a strongly connected component of $(K', E')$, say
$(K'', E'')$, which contains $(K, E)$, because $(K, E)$ satisfies condition (2). We have
$v^A(K) \lesssim_A v^A(K'')$ (because $K \subseteq K''$ and $\lesssim_A$ is monotonic) for every $A$,
and thus $W(\mathcal{G}, v(K)) \subseteq W(\mathcal{G}, v(K''))$. This ensures that $T((K'', E''))$
contains $(K, E)$ as a subgraph. Since $(K'', E'')$ is a subgraph of $(K', E')$, the graph
$T((K'', E''))$ also is. We show that they are not equal, so that we can apply the induction
hypothesis to $T((K'', E''))$. For this, we exploit the fact that $(K', E')$ does not satisfy
one of conditions (2) to (4):

• first, if $(K', E')$ is not strongly connected while $(K'', E'')$ is, they cannot be equal;

• if there is some $k \in K'$ such that $(k, \mathsf{Agt})$ is not in $W(\mathcal{G}, v(K'))$, then $k$
is not a vertex of $T((K'', E''))$;

• if there is some edge $(k, k')$ in $E'$ such that there is no state $(k, \mathsf{Agt}, m_{\mathsf{Agt}})$
in $W(\mathcal{G}, v(K'))$ such that $\mathsf{Tab}(k, m_{\mathsf{Agt}}) = k'$, then the edge $(k, k')$
is not in $T((K'', E''))$.

We then apply the induction hypothesis to $T((K'', E''))$, and get the expected result. Now,
because of condition (1), $u^A \lesssim_A v^A(K) \lesssim_A w^A$. Hence, due to the previous analysis,
there exists $(K', E') \in \mathrm{SSG}((\bigcap_{A \in \mathsf{Agt}} S^A, \mathsf{Edg}'))$ such that
$(K, E) \subseteq (K', E')$. This concludes the proof of the lemma. □

Lemma 5.12. Under assumptions $(\star)$, if for every $K$, the set $W(\mathcal{G}, v(K))$ can be
computed in polynomial time, then the set Sol can also be computed in polynomial time.

Proof. Each recursive call to SSG applies to a decomposition in strongly connected components
of the current transition system under consideration. Hence the number of recursive
calls is bounded by $|\mathsf{States}|^2$. Computing the decomposition in SCCs can be done in linear
time. By assumption, each set $W(\mathcal{G}, v(K))$ can be computed in polynomial time. The set
$\bigcap_{A \in \mathsf{Agt}} S^A$ is obtained by removing the targets of the losers (for $w^A$) from
$\mathsf{States}$. Hence globally we can compute Sol in polynomial time. □

To conclude the algorithm, we need to check that condition (5) holds for one of the
solutions $(K, E)$ in Sol. It can be done in polynomial time by looking for a path in the
winning region of Eve in $\mathcal{H}(\mathcal{G}, v(K))$ that reaches $K \times \{\mathsf{Agt}\}$ from
$(s, \mathsf{Agt})$. The correctness of the algorithm is ensured by the fact that if some $(K, E)$
satisfies the five conditions, there is a $(K', E')$ in Sol with $K \subseteq K'$ and $E \subseteq E'$.
Since $K \subseteq K'$ implies $v^A(K) \lesssim_A v^A(K')$, the winning region of Eve in
$\mathcal{H}(\mathcal{G}, v(K'))$ is larger than that in $\mathcal{H}(\mathcal{G}, v(K))$, which implies
that the path from $(s, \mathsf{Agt})$ to $K \times \{\mathsf{Agt}\}$ is also a path from
$(s, \mathsf{Agt})$ to $K' \times \{\mathsf{Agt}\}$. Hence, $(K', E')$ also satisfies condition (5),
and therefore the five expected conditions.

We have already mentioned that single Büchi objectives do satisfy the hypotheses $(\star)$.
Furthermore, Lemma 5.9 shows that, given $v(K)$, one can compute the set $W(\mathcal{G}, v(K))$ as
the winning region of a co-Büchi turn-based game, which can be done in polynomial time
(this is argued at the beginning of the section). Therefore Lemma 5.12 and the subsequent
analysis apply: this concludes the proof that the constrained NE existence problem for finite
games with single Büchi objectives is in P.

5.3.3. Hardness. We recall a possible proof of P-hardness for the value problem, from which
we will infer the other lower bounds. The circuit-value problem can be easily encoded into
a deterministic turn-based game with Büchi objectives: a circuit (which we assume w.l.o.g.
has only AND- and OR-gates) is transformed into a two-player turn-based game, where one
player controls the AND-gates and the other player controls the OR-gates. We add self-loops
on the leaves. Positive leaves of the circuit are the (Büchi) objective of the OR-player, and
negative leaves are the (Büchi) objective of the AND-player. Then obviously, the circuit
evaluates to true iff the OR-player has a winning strategy for satisfying his Büchi condition,
which in turn is equivalent to the fact that there is an equilibrium with payoff 0 for the
AND-player, by Proposition 3.2. We obtain P-hardness for the NE existence problem, using
Proposition 3.4: the preference relations in the game constructed in Proposition 3.4 are
Büchi objectives.
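
The encoding of the circuit-value problem can be illustrated by the following Python sketch. It is
a minimal illustration under assumptions of ours: the dictionary encoding of circuits and the
function names are hypothetical, the circuit is assumed acyclic with at least one input per gate,
and the Büchi game is solved by a plain attractor computation (on such a game every play ends in
the self-loop of a leaf, so that winning the Büchi objective amounts to forcing a positive leaf).

```python
def or_player_wins(circuit, root):
    """Gates from which the OR-player can force a positive leaf (attractor)."""
    win = {g for g, (kind, arg) in circuit.items() if kind == "LEAF" and arg}
    changed = True
    while changed:
        changed = False
        for g, (kind, arg) in circuit.items():
            if g in win or kind == "LEAF":
                continue
            # OR-gates belong to the OR-player, AND-gates to her opponent.
            if (kind == "OR" and any(c in win for c in arg)) or \
               (kind == "AND" and all(c in win for c in arg)):
                win.add(g)
                changed = True
    return root in win


def evaluate(circuit, gate):
    """Direct recursive evaluation of the circuit, used as a reference."""
    kind, arg = circuit[gate]
    if kind == "LEAF":
        return arg
    values = [evaluate(circuit, c) for c in arg]
    return any(values) if kind == "OR" else all(values)


# Hypothetical example circuit: g3 = (x AND y) OR x, then AND x.
circuit = {
    "x": ("LEAF", True),
    "y": ("LEAF", False),
    "g1": ("AND", ["x", "y"]),
    "g2": ("OR", ["g1", "x"]),
    "g3": ("AND", ["g2", "x"]),
}
agreement = all(or_player_wins(circuit, g) == evaluate(circuit, g) for g in circuit)
```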

5.4. Co-Büchi objectives. The value problem for co-Büchi objectives is P-complete. We
now prove that the constrained NE existence problem is in NP, and that the constrained
NE existence problem and the NE existence problem are NP-hard. We therefore deduce:

Theorem 5.13. For finite games with single co-Büchi objectives, the NE existence problem
and the constrained NE existence problem are NP-complete.

The proof of this theorem is very similar to that for safety objectives: instead of
conjunctions of reachability objectives, we need to deal with conjunctions of Büchi objectives.
Of course constructions and algorithms need to be adapted. That is what we present now.

5.4.1. Reduction to a conjunction of Büchi conditions. We assume that for every player $A$,
the preference relation $\lesssim_A$ is given by a single co-Büchi objective $\Omega_A$ with target
set $T_A$. In the corresponding suspect game, we show that the goal of player Eve is equivalent
to a conjunction of Büchi objectives. Let $L \subseteq \mathsf{Agt}$. In suspect game
$\mathcal{H}(\mathcal{G}, L)$, we define several Büchi objectives as follows: for each $A \in L$, we define
$T'_A = T_A \times \{P \mid P \subseteq \mathsf{Agt}\} \cup \mathsf{States} \times \{P \mid A \notin P\}$,
and we write $\Omega'_A$ for the corresponding Büchi objective.

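Lemma 5.14. A play $\rho$ is winning for Eve in $\mathcal{H}(\mathcal{G}, L)$ iff $\rho \in \bigcap_{A \in L} \Omega'_A$.

Proof. Let $\rho$ be a play in $\mathcal{H}(\mathcal{G}, L)$, and assume it is winning for Eve. Then, for
each $A \in \lambda(\rho) \cap L$, $\rho \notin \Omega_A$, which means that the target set $T_A$ is visited
infinitely often along $\mathrm{proj}_1(\rho)$, and therefore $T'_A$ is visited infinitely often along
$\rho$. If $A \notin \lambda(\rho)$ then a state $(s, P)$ with $A \notin P$ is visited infinitely often
by $\rho$: the target set $T'_A$ is visited infinitely often. This implies that
$\rho \in \bigcap_{A \in L} \Omega'_A$.

Conversely let $\rho \in \bigcap_{A \in L} \Omega'_A$. For every $A \in L$, $T'_A$ is visited infinitely
often by $\rho$. Then, either $T_A$ is visited infinitely often by $\mathrm{proj}_1(\rho)$ (which means
that $\rho \notin \Omega_A$) or $A \notin \lambda(\rho)$. In particular, $\rho$ is a winning play for Eve
in $\mathcal{H}(\mathcal{G}, L)$. □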

5.4.2. Algorithm for solving zero-sum games with a conjunction of Büchi objectives. We
adapt the algorithm for conjunctions of reachability objectives (page 28) to conjunctions
of Büchi objectives. Let $\mathcal{G}$ be a two-player turn-based game with a winning objective for
Eve given as a conjunction of Büchi objectives $\Omega_1, \ldots, \Omega_k$. The idea is to construct a
new game $\mathcal{G}'$ which checks that each objective is visited infinitely often. The vertices of
$\mathcal{G}'$ controlled by Eve and Adam are $V'_\exists = V_\exists \times [0, k]$ and
$V'_\forall = V_\forall \times [0, k]$ respectively. There is a transition from $(v, k)$ to $(v', 0)$
iff there is a transition from $v$ to $v'$ in the original game and for $0 \le i < k$, there is a
transition from $(v, i)$ to $(v', i + 1)$ iff there is a transition from $v$ to $v'$ in the original
game and $v' \in \Omega_{i+1}$. In $\mathcal{G}'$, the objective for Eve is the Büchi objective $\Omega$
given by target set $\mathsf{States} \times \{k\}$, where $\mathsf{States} = V_\exists \cup V_\forall$ is
the set of vertices of $\mathcal{G}$. It is clear that there is a winning strategy in $\mathcal{G}$ from
$v_0$ for the conjunction of Büchi objectives $\Omega_1, \ldots, \Omega_k$ iff there is a winning
strategy in $\mathcal{G}'$ from $(v_0, 0)$ for the Büchi objective $\Omega$. The number of states of game
$\mathcal{G}'$ is $|\mathsf{States}'| = |\mathsf{States}| \cdot k$, and the size of the transition table
is $|\mathsf{Tab}'| = |\mathsf{Tab}| \cdot k$. Using the standard algorithm for turn-based Büchi
objectives [13], which works in time $O(|\mathsf{States}'| \cdot |\mathsf{Tab}'|)$, we obtain an
algorithm for solving zero-sum games with a conjunction of Büchi objectives running in time
$O(k^2 \cdot |\mathsf{States}| \cdot |\mathsf{Tab}|)$ (hence in polynomial time).
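
The counter construction can be sketched in Python as follows. This is a minimal sketch under
assumptions of ours, not the algorithm of [13] itself: the game is given by hypothetical
dictionaries `succ` and `owner`, it is assumed to have no dead ends, the counter is assumed to stay
unchanged when the next vertex does not belong to the awaited target set (a case the description
above leaves implicit), and the Büchi game is solved by the classical (quadratic) nested-attractor
computation.

```python
def attractor(V, succ, owner, player, target):
    """Vertices of the subgame V from which `player` can force reaching target."""
    attr = set(target)
    changed = True
    while changed:
        changed = False
        for v in V - attr:
            nxt = [w for w in succ[v] if w in V]
            if owner[v] == player:
                good = any(w in attr for w in nxt)
            else:
                good = bool(nxt) and all(w in attr for w in nxt)
            if good:
                attr.add(v)
                changed = True
    return attr


def buchi_region(vertices, succ, owner, target):
    """Winning region of Eve for the Buchi objective given by `target`."""
    V = set(vertices)
    while True:
        reach = attractor(V, succ, owner, "Eve", target & V)
        trap = attractor(V, succ, owner, "Adam", V - reach)
        if not trap:
            return V
        V -= trap                       # Adam wins from `trap`; remove and repeat


def conjunction_region(vertices, succ, owner, targets):
    """Winning region of Eve for the conjunction of Buchi objectives `targets`."""
    k = len(targets)
    pv = [(v, i) for v in vertices for i in range(k + 1)]
    psucc = {}
    for v, i in pv:
        if i == k:                      # counter is reset after a full round
            psucc[(v, i)] = [(w, 0) for w in succ[v]]
        else:                           # assumption: counter stays put otherwise
            psucc[(v, i)] = [(w, i + 1 if w in targets[i] else i) for w in succ[v]]
    powner = {(v, i): owner[v] for v, i in pv}
    ptarget = {(v, k) for v in vertices}
    win = buchi_region(pv, psucc, powner, ptarget)
    return {v for v in vertices if (v, 0) in win}


# Hypothetical example: Eve owns a; from a she may loop through b or fall into sink c.
succ = {"a": ["b", "c"], "b": ["a"], "c": ["c"]}
owner = {"a": "Eve", "b": "Adam", "c": "Adam"}
region = conjunction_region(["a", "b", "c"], succ, owner, [{"a"}, {"b"}])
```

In the example, Eve can alternate between a and b forever, so both target sets are visited
infinitely often from a and from b, whereas no target is ever visited from the sink c.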

5.4.3. Algorithm. The algorithm is the same as for reachability objectives. Only the
computation of the set of winning states in the suspect game is different. Since we just showed
that this part can be done in polynomial time, the global algorithm still runs in
(non-deterministic) polynomial time.

Figure 13. Module $M(\phi)$, where $\phi = c_1 \wedge \cdots \wedge c_n$ and $c_i = \ell_{i,1} \vee \ell_{i,2} \vee \ell_{i,3}$

5.4.4. Hardness. The hardness result for the constrained NE existence problem with
co-Büchi objectives was already proven in [40]. The idea is to encode an instance of 3SAT into
a game with co-Büchi objectives. For completeness we describe the reduction below, and
explain how it can be modified for proving NP-hardness of the NE existence problem.

Let us consider an instance $\phi = c_1 \wedge \cdots \wedge c_n$ of SAT, where
$c_i = \ell_{i,1} \vee \ell_{i,2} \vee \ell_{i,3}$, and $\ell_{i,j} \in \{x_k, \neg x_k \mid 1 \le k \le p\}$.
The game $\mathcal{G}$ is obtained from module $M(\phi)$ depicted on Figure 13 by joining the outgoing
edge of $c_{n+1}$ to $c_1$. Each module $M(\phi)$ involves a set of players $B_k$, one for each
variable $x_k$, and a player $A_1$. Player $A_1$ controls the clause states. Player $B_k$ controls
the literal states $\ell_{i,j}$ when $\ell_{i,j} = \neg x_k$, then having the opportunity to go
to state T. There is no transition to T for literals of the form $x_k$. In $M(\phi)$, assuming that
the players $B_k$ will not play to T, then $A_1$ has a strategy that does not visit both $x_k$ and
$\neg x_k$ for every $k$ if, and only if, formula $\phi$ is satisfiable. Finally, the co-Büchi
objective of $B_k$ is given by $\{x_k\}$. In other terms, the aim of $B_k$ is to visit $x_k$ only a
finite number of times. This way, in a Nash equilibrium, it cannot be the case that both $x_k$ and
$\neg x_k$ are visited infinitely often: it would imply that $B_k$ loses but could improve her payoff
by going to T (actually, $\neg x_k$ should not be visited at all if $x_k$ is visited infinitely
often). Therefore setting the objective of $A_1$ to $\{$T$\}$, there is a Nash equilibrium where she
wins iff $\phi$ is satisfiable. This shows NP-hardness for the constrained NE existence problem.

For the NE existence problem, we use the transformation described in Section 3.4. We
add an extra player $A_2$ to $\mathcal{G}$ and consider the game $\mathcal{G}' = E(\mathcal{G}, A_1, A_2, \rho)$,
where $\rho$ is a winning path for $A_1$. The objective of the players in $\mathcal{G}'$ can be described
by co-Büchi objectives: $A_2$ has to avoid seeing $T = \{s_1\}$ infinitely often and we keep the same
target for $A_1$. Applying Proposition 3.5, there is a Nash equilibrium in $\mathcal{G}'$ if, and only
if, there is one in $\mathcal{G}$ where $A_1$ wins; this shows NP-hardness for the NE existence problem.

5.5. Objectives given as circuits. The value problem is known to be PSPACE-complete
for turn-based games and objectives given as circuits [27]. The transformation presented in
the beginning of the section can be used to decide the value problem for finite concurrent
games with a single circuit-objective, yielding PSPACE-completeness of the value problem
in the case of finite concurrent games as well.

We now show that the (constrained) NE existence problem is also PSPACE-complete in
this framework:

Theorem 5.15. For finite games with single objectives given as circuits, the NE existence
problem and the constrained NE existence problem are PSPACE-complete.

5.5.1. Reduction to a circuit objective. We assume the preference relation of each player
$A \in \mathsf{Agt}$ is given by a circuit $C_A$. Let $L \subseteq \mathsf{Agt}$. We define a Boolean circuit
defining the winning condition of Eve in the suspect game $\mathcal{H}(\mathcal{G}, L)$.

We define for each player $A \in \mathsf{Agt}$ and each set $P$ of players (such that
$\mathsf{States} \times P$ is reachable in $\mathcal{H}(\mathcal{G}, L)$), a circuit $D_{A,P}$ which outputs
true for the plays $\rho$ with $\lambda(\rho) = P$ (i.e. whose states that are visited infinitely
often are in $\mathsf{States} \times \{P\}$), and whose value by $C_A$ is true. We do so by making a copy
of the circuit $C_A$, adding $|\mathsf{States}|$ OR gates $g_1, \ldots, g_{|\mathsf{States}|}$
and one AND gate $h$. There is an edge from $(s_i, P)$ to $g_i$ and from $g_i$ to $g_{i+1}$ if
$i < |\mathsf{States}|$, then there is an edge from the output gate of $C_A$ to $h$ and from $h$ to the
output gate of the new circuit. Inputs of $C_A$ are now the $(s, P)$'s (instead of the $s$'s). The
circuit $D_{A,P}$ is given on Figure 14.



Figure 14. Circuit $D_{A,P}$

We then define a circuit $E_A$ which outputs true for the plays $\rho$ with $A \in \lambda(\rho)$ and
whose output by $C_A$ is true. We do so by taking the disjunction of the circuits $D_{A,P}$.
Formally, for each set of players $P$ such that $\mathsf{States} \times P$ is reachable in the suspect
game and $A \in P$, we include the circuit $D_{A,P}$ and, writing $o_{A,P}$ for its output gate, we
add OR gates so that there is an edge from $o_{A,P}$ to $g_i$ and from $g_i$ to $g_{i+1}$, and then
from $g_{n+1}$ to the output gate.

Finally we define the circuit $F_L$, which outputs true for the plays $\rho$ such that there is
no $A \in L$ such that $A \in \lambda(\rho)$ and the output of $\mathrm{proj}_1(\rho)$ by $C_A$ is true.
This corresponds exactly to the plays that are winning for Eve in suspect game
$\mathcal{H}(\mathcal{G}, L)$. We do so by negating the disjunction of all the circuits $E_A$ for $A \in L$.

The next lemma follows from the construction: 

Lemma 5.16. A play $\rho$ is winning for Eve in $\mathcal{H}(\mathcal{G}, L)$ iff $\rho$ evaluates circuit $F_L$ to true.

We should notice that circuit $F_L$ has size polynomial in the size of $\mathcal{G}$, thanks to
Proposition 4.8.

5.5.2. Algorithm and complexity analysis. To solve the constrained NE existence problem
we apply the same algorithm as for reachability objectives (see the section on reachability
objectives). For complexity matters, the only difference lies in the computation of the set of
winning states in the suspect game. Thanks to Lemma 5.16 we know it reduces to the computation of
the set of winning states in a turn-based game with an objective given as a circuit (of
polynomial size). This can be done in PSPACE [27], which yields a PSPACE upper bound for the
constrained NE existence problem (and therefore for the NE existence problem and the value
problem - see Proposition 3.2). PSPACE-hardness of all problems follows from that of the value
problem in turn-based games [27], and from Propositions 3.2 and 3.4 (we notice that the preference
relations in the new games are easily definable by circuits).

5.6. Rabin and parity objectives. The value problem is known to be NP-complete for
Rabin conditions [18] and in UP $\cap$ co-UP for parity conditions [28].

We then notice that a parity condition is a Rabin condition with half as many pairs as
the number of priorities: assume the parity condition is given by $p \colon \mathsf{States} \to [0, d]$
with $d \in \mathbb{N}$; take for $i$ in $[0, \frac{d}{2}]$, $Q_i = p^{-1}\{2i\}$ and
$R_i = p^{-1}\{2j + 1 \mid j \ge i\}$. Then the Rabin objective $(Q_i, R_i)_{0 \le i \le \frac{d}{2}}$
is equivalent to the parity condition given by $p$.
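
The translation from parity to Rabin conditions can be checked on small examples with the
following Python sketch. The conventions are assumptions of the sketch, chosen to match the
construction above: a play satisfies the parity condition when the largest priority visited
infinitely often is even, and it satisfies a Rabin pair $(Q_i, R_i)$ when $Q_i$ is visited
infinitely often and $R_i$ is not. Function names and the encoding of priorities as a dictionary
are hypothetical.

```python
from itertools import combinations


def parity_to_rabin(priority):
    """Build the Rabin pairs (Q_i, R_i) from a priority function given as a dict."""
    d = max(priority.values())
    pairs = []
    for i in range(d // 2 + 1):
        Q = {s for s, p in priority.items() if p == 2 * i}
        R = {s for s, p in priority.items() if p % 2 == 1 and p >= 2 * i + 1}
        pairs.append((Q, R))
    return pairs


def parity_accepts(priority, inf):
    """inf is the (non-empty) set of states visited infinitely often."""
    return max(priority[s] for s in inf) % 2 == 0


def rabin_accepts(pairs, inf):
    return any(bool(inf & Q) and not (inf & R) for Q, R in pairs)


# Exhaustive comparison on a four-state example with priorities 0..3.
priority = {"s0": 0, "s1": 1, "s2": 2, "s3": 3}
pairs = parity_to_rabin(priority)
states = list(priority)
equivalent = all(
    parity_accepts(priority, set(inf)) == rabin_accepts(pairs, set(inf))
    for size in range(1, len(states) + 1)
    for inf in combinations(states, size)
)
```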

We design an algorithm that solves the constrained NE existence problem in
$\mathsf{P}^{\mathsf{NP}}_{\parallel}$ for Rabin objectives (see footnote 1 for an informal definition
of $\mathsf{P}^{\mathsf{NP}}_{\parallel}$).

Our algorithm heavily uses non-determinism (via the oracle). We then propose a deterministic
algorithm which runs in exponential time, but will be useful in Section 5.7. This
subsection ends with proving $\mathsf{P}^{\mathsf{NP}}_{\parallel}$-hardness of the constrained NE
existence problem and NE existence problem for parity objectives. In the end, we will have proven
the following theorem:

Theorem 5.17. For finite games with single objectives given as Rabin or parity conditions,
the NE existence problem and the constrained NE existence problem are
$\mathsf{P}^{\mathsf{NP}}_{\parallel}$-complete.


5.6.1. Reduction to a Streett game. We assume that the preference relation of each player
$A \in \mathsf{Agt}$ is given by the Rabin condition $(Q_{i,A}, R_{i,A})_{i \in [1, k_A]}$. Let
$L \subseteq \mathsf{Agt}$. In the suspect game $\mathcal{H}(\mathcal{G}, L)$, we define the Streett
objective $(Q'_{i,A}, R'_{i,A})_{i \in [1, k_A], A \in L}$, where
$Q'_{i,A} = (Q_{i,A} \times \{P \mid A \in P\}) \cup (\mathsf{States} \times \{P \mid A \notin P\})$ and
$R'_{i,A} = R_{i,A} \times \{P \mid A \in P\}$, and we write $\Omega_L$
for the corresponding set of winning plays.

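Lemma 5.18. A play $\rho$ is winning for Eve in $\mathcal{H}(\mathcal{G}, L)$ iff $\rho \in \Omega_L$.

Proof. Assume $\rho$ is winning for Eve in $\mathcal{H}(\mathcal{G}, L)$. For all
$A \in \lambda(\rho) \cap L$, $\mathrm{proj}_1(\rho)$ does not satisfy the Rabin condition given by
$(Q_{i,A}, R_{i,A})_{i \in [1, k_A]}$: for all $1 \le i \le k_A$,
$\mathrm{Inf}(\mathrm{proj}_1(\rho)) \cap Q_{i,A} = \emptyset$ or
$\mathrm{Inf}(\mathrm{proj}_1(\rho)) \cap R_{i,A} \neq \emptyset$. We infer that for all $1 \le i \le k_A$,
$\mathrm{Inf}(\rho) \cap Q'_{i,A} = \emptyset$ or $\mathrm{Inf}(\rho) \cap R'_{i,A} \neq \emptyset$. Now, if
$A \notin \lambda(\rho)$ then all $Q'_{i,A}$ are seen infinitely often along $\rho$. Therefore
for every $A \in L$, the Streett conditions $(Q'_{i,A}, R'_{i,A})$ are satisfied along $\rho$ (that is,
$\rho \in \Omega_L$).

Conversely, if the Streett condition $(Q'_{i,A}, R'_{i,A})_{i \in [1, k_A], A \in L}$ is satisfied
along $\rho$, then either the Rabin condition $(Q_{i,A}, R_{i,A})$ is not satisfied along
$\mathrm{proj}_1(\rho)$ or $A \notin \lambda(\rho)$. This means that Eve is winning in
$\mathcal{H}(\mathcal{G}, L)$. □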

5.6.2. Algorithm. We now describe a $\mathsf{P}^{\mathsf{NP}}_{\parallel}$ algorithm for solving the
constrained NE existence problem in games where each player has a single Rabin objective. As in
the previous cases, our algorithm relies on the suspect game construction.

Write $\mathcal{P}$ for the set of sets of players of $\mathsf{Agt}$ that appear as the second item of
a state of $\mathcal{J}(\mathcal{G})$:

$$\mathcal{P} = \{P \subseteq \mathsf{Agt} \mid \exists s \in \mathsf{States}.\ (s, P) \text{ is a state of } \mathcal{J}(\mathcal{G})\}.$$

Since $\mathcal{J}(\mathcal{G})$ has size polynomial, so has $\mathcal{P}$. Also, for any path $\rho$,
$\lambda(\rho)$ is a set of $\mathcal{P}$. Hence, for a fixed $L$, the number of sets
$\lambda(\rho) \cap L$ is polynomial. Now, as recalled earlier, the winning condition for Eve is that
the players in $\lambda(\rho) \cap L$ must be losing along $\mathrm{proj}_1(\rho)$ in $\mathcal{G}$ for
their Rabin objective. We have seen that this can be seen as a Streett objective (Lemma 5.18).

Now, whether a state is winning in a turn-based game for a Streett condition
can be decided in coNP [18]. Hence, given a state $s \in \mathsf{States}$ and a set $L$, we can decide
in coNP whether $s$ is winning for Eve in $\mathcal{H}(\mathcal{G}, L)$. This will be used as an oracle
in our algorithm below.

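Now, pick a set $P \subseteq \mathsf{Agt}$ of suspects, i.e., for which there exist
$(s, t) \in \mathsf{States}^2$ and $m_{\mathsf{Agt}}$ s.t. $P = \mathsf{Susp}((s, t), m_{\mathsf{Agt}})$.
Using the same arguments as in the proof of Proposition 4.8, it can be shown that
$2^{|P|} \le |\mathsf{Tab}|$, so that the number of subsets of $P$ is polynomial. Now, for each set $P$
of suspects and each $L \subseteq P$, write $w(L)$ for the size of the winning region of Eve in
$\mathcal{H}(\mathcal{G}, L)$. Then the sum
$\sum_{P \in \mathcal{P} \setminus \{\mathsf{Agt}\}} \sum_{L \subseteq P} w(L)$ is at most
$|\mathsf{States}| \times |\mathsf{Tab}|^2$.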

Assume that the exact value $M$ of this sum is known, and consider the following algorithm:

(1) for each $P \in \mathcal{P} \setminus \{\mathsf{Agt}\}$ and each $L \subseteq P$, guess a set
$W(L) \subseteq \mathsf{States}$, which we intend to be the exact winning region for Eve in
$\mathcal{H}(\mathcal{G}, L)$.

(2) check that the sizes of those sets sum up to $M$;

(3) for each $s \notin W(L)$, check that Eve does not have a winning strategy from $s$ in
$\mathcal{H}(\mathcal{G}, L)$. This can be checked in non-deterministic polynomial time, as explained
above.

(4) guess a lasso-shaped path $\rho = \pi \cdot \tau^\omega$ in $\mathcal{H}(\mathcal{G}, L)$ starting from
$(s, \mathsf{Agt})$, with $|\pi|$ and $|\tau|$ less than $|\mathsf{States}|^2$ (following
Proposition 3.1), visiting only states where the second item is $\mathsf{Agt}$. This path can be seen
as the outcome of some strategy of Eve when Adam obeys. For this path, we then check the following:

• along $\rho$, the sets of winning and losing players satisfy the original constraint (remember
that we aim at solving the constrained NE existence problem);

• any deviation along $\rho$ leads to a state that is winning for Eve. In other terms, pick a
state $h = (s, \mathsf{Agt}, m_{\mathsf{Agt}})$ of Adam along $\rho$, and pick a successor
$h' = (t, P)$ of $h$ such that $t \neq \mathsf{Tab}(s, m_{\mathsf{Agt}})$. Then the algorithm checks
that $t \in W(L \cap P)$.

The algorithm accepts the input $M$ if it succeeds in finding the sets $W$ and the path $\rho$
such that all the checks are successful. This algorithm is non-deterministic and runs in
polynomial time, and will be used as a second oracle.

We now show that if $M$ is exactly the sum of the $w(L)$, then the algorithm accepts $M$
if, and only if, there is a Nash equilibrium satisfying the constraint, i.e., if, and only if, Eve
has a winning strategy from $(s, \mathsf{Agt})$ in $\mathcal{H}(\mathcal{G}, L)$.

First assume that the algorithm accepts $M$. This means that it is able, for each $L$, to
find sets $W(L)$ of states whose complement does not intersect the winning region of
$\mathcal{H}(\mathcal{G}, L)$. Since $M$ is assumed to be the exact sum of the $w(L)$ and the sizes of
the sets $W(L)$ sum up to $M$, we deduce that $W(L)$ is exactly the winning region of Eve in
$\mathcal{H}(\mathcal{G}, L)$. Now, since the algorithm accepts, it is also able to find a
(lasso-shaped) path $\rho$ only visiting states having $\mathsf{Agt}$ as the second component. This path
has the additional property that any "deviation" from a state of Adam along this path ends up in a
state that is winning for Eve for players in $L \cap P$, where $P$ is the set of suspects for the
present deviation. This way, if during $\rho$, Adam deviates to a state $(t, P)$, then Eve will have
a strategy to ensure that along any subsequent play, the objectives of players in $L \cap P$ (in
$\mathcal{G}$) are not fulfilled, so that along any run $\rho'$, the players in
$L \cap \lambda(\rho')$ are losing for their objectives in $\mathcal{G}$, so that Eve wins
in $\mathcal{H}(\mathcal{G}, L)$.

Conversely, assume that there is a Nash equilibrium satisfying the constraint. Following
Proposition 3.1, we assume that the outcome of the corresponding strategy profile has
the form $\pi \cdot \tau^\omega$. From Lemma 4.4, there is a winning strategy for Eve in
$\mathcal{H}(\mathcal{G}, L)$ whose outcome when Adam obeys follows the outcome of the Nash
equilibrium. As a consequence, the outcome when Adam obeys is a path $\rho$ that the algorithm can
guess. Indeed, it must satisfy the constraints, and any deviation from $\rho$ with set of suspects
$P$ ends in a state where Eve wins for the winning condition of $\mathcal{H}(\mathcal{G}, L)$, hence
also for the winning condition of $\mathcal{H}(\mathcal{G}, L \cap P)$, since any path $\rho'$ visiting
$(t, P)$ has $\lambda(\rho') \subseteq P$.

Finally, our global algorithm is as follows: we run the first oracle for all the states and
all the sets $L$ that are subsets of a set of suspects (we know that there are polynomially
many such inputs). We also run the second algorithm on all the possible values for $M$,
which are also polynomially many. Now, from the answers of the first oracle, we compute
the exact value $M$, and return the value given by the second on that input. This algorithm
runs in $\mathsf{P}^{\mathsf{NP}}_{\parallel}$ and decides the constrained NE existence problem.


5.6.3. Deterministic algorithm. In the next section we will need a deterministic algorithm to
solve games with objectives given as deterministic Rabin automata. We therefore present it
right now. The deterministic algorithm works by successively trying all the possible payoffs;
there are $2^{|\mathsf{Agt}|}$ of them. Then it computes the winning strategies of the suspect game
for that payoff. In [25] an algorithm for Streett games is given, which works in time
$O(n^k \cdot k!)$, where $n$ is the number of vertices in the game, and $k$ the size of the Streett
condition. The algorithm has to find, in the winning region of Eve in $\mathcal{J}(\mathcal{G})$, a
lasso that satisfies the Rabin winning conditions of the winners and does not satisfy those of the
losers. To do so it tries all the possible choices of elementary Rabin condition that are satisfied
to make the players win; there are at most $\prod_{A \in \mathsf{Agt}} k_A$ possible choices. And for
the losers, we try the possible choices for whether $Q_{i,A}$ is visited or not; there are
$\prod_{A \in \mathsf{Agt}} 2^{k_A}$ such choices. It then looks for a lasso cycle that, when $A$ is a
winner, does not visit $R_{i,A}$ and visits $Q_{i,A}$, and when $A$ is a loser, visits $R_{i,A}$ when
it has to, or does not visit $Q_{i,A}$. This is equivalent to finding a path satisfying a
conjunction of Büchi conditions and can be done in polynomial time
$O(n \times \sum_{A \in \mathsf{Agt}} k_A)$. The global algorithm works in time

$$O\Big( \ldots \Big(\sum_{A} k_A\Big)! + \ldots \prod_{A \in \mathsf{Agt}} k_A \cdot 2^{k_A} \Big)$$

Notice that the exponential does not come from the size of the graph but from the number of
agents and the number of elementary Rabin conditions; this will be important when, in the
next subsection, we reuse the algorithm on a game structure whose size is exponential.


5.6.4. $\mathsf{P}^{\mathsf{NP}}_{\parallel}$-hardness. We now prove
$\mathsf{P}^{\mathsf{NP}}_{\parallel}$-hardness of the (constrained) NE existence problem in the case
of parity objectives. The main reduction is an encoding of the $\oplus$SAT
problem, where the aim is to decide whether the number of satisfiable instances among a
set of formulas is even. This problem is known to be complete for
$\mathsf{P}^{\mathsf{NP}}_{\parallel}$ [22].

Before tackling the whole reduction, we first develop some preliminaries on single
instances of SAT, inspired from [12]. Let us consider an instance
$\phi = c_1 \wedge \cdots \wedge c_n$ of SAT, where $c_i = \ell_{i,1} \vee \ell_{i,2} \vee \ell_{i,3}$,
and $\ell_{i,j} \in \{x_k, \neg x_k \mid 1 \le k \le p\}$. With $\phi$, we associate a three-player
game $N(\phi)$, depicted on Figure 15 (where the first state of $N(\phi)$ is controlled by $A_1$, and
the first state of each $N'(c_j)$ is concurrently controlled by $A_2$ and $A_3$). For each variable
$x_j$, players $A_2$ and $A_3$ have the following target sets:

Figure 15. The game $N(\phi)$ (left), where $N'(c_i)$ is the module on the right.

This construction enjoys interesting properties, given by the following lemma: 

Lemma 5.19. If the formula $\phi$ is not satisfiable, then there is a strategy for player $A_1$
in $N(\phi)$ such that players $A_2$ and $A_3$ lose. If the formula $\phi$ is satisfiable, then for any
strategy profile $\sigma_{\mathsf{Agt}}$, one of $A_2$ and $A_3$ can change her strategy and win.

Proof. We begin with the first statement, assuming that $\phi$ is not satisfiable and defining the
strategy for $A_1$. With a history $h$ in $N(\phi)$ we associate a valuation
$v^h \colon \{x_k \mid k \in [1, p]\} \to \{\top, \bot\}$ (where $p$ is the number of distinct
variables in $\phi$), defined as follows:

$$v^h(x_k) = \top \iff \exists m.\ h_m = x_k \wedge \forall m' > m.\ h_{m'} \neq \neg x_k \qquad \text{for all } k \in [1, p]$$

We also define $v^h(\neg x_k) = \neg v^h(x_k)$. Under this definition, $v^h(x_k) = \top$ if the last
occurrence of $x_k$ or $\neg x_k$ along $h$ was $x_k$. We then define a strategy $\sigma_1$ for player
$A_1$: after a history $h$ ending in an $A_1$-state, we require $\sigma_1(h)$ to go to $N'(c_i)$ for
some $c_i$ (with least index, say) that evaluates to false under $v^h$ (such a $c_i$ exists since
$\phi$ is not satisfiable). This strategy enforces that if $h \cdot \sigma_1(h) \cdot \ell_{i,j}$ is a
finite outcome of $\sigma_1$, then $v^h(\ell_{i,j}) = \bot$, because $A_1$ has selected a clause $c_i$
whose literals all evaluate to $\bot$. Moreover,
$v^{h \cdot \sigma_1(h) \cdot \ell_{i,j}}(\ell_{i,j}) = \top$, so that for each $k$, any outcome of
$\sigma_1$ will either alternate between $x_k$ and $\neg x_k$ (hence visit both of them infinitely
often), or no longer visit any of them after some point. Hence both $A_2$ and $A_3$ lose.

We now prove the second statement. Let $v$ be a valuation under which $\phi$ evaluates
to true, and $\sigma_{\mathsf{Agt}}$ be a strategy profile. From $\sigma_{A_2}$ and $\sigma_{A_3}$, we
define two strategies $\sigma'_{A_2}$ and $\sigma'_{A_3}$. Consider a finite history $h$ ending in the
first state of $N'(c_i)$, for some $i$. Pick a literal $\ell_{i,j}$ of $c_i$ that is true under $v$
(the one with least index, say). We set

$$\sigma'_{A_2}(h) = [j - \sigma_{A_3}(h) \pmod 3] \qquad \sigma'_{A_3}(h) = [j - \sigma_{A_2}(h) \pmod 3].$$

It is easily checked that, when $\sigma'_{A_2}$ and $\sigma_{A_3}$ (or $\sigma_{A_2}$ and
$\sigma'_{A_3}$) are played simultaneously in the first state of some $N'(c_i)$, then the game goes
to $\ell_{i,j}$. Thus under those strategies, any visited literal evaluates to true under $v$, which
means that at most one of $x_k$ and $\neg x_k$ is visited (infinitely often). Hence one of $A_2$ and
$A_3$ is winning, which proves our claim.

□
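
Both halves of this proof can be illustrated by a small Python sketch. It rests on assumptions of
ours that are not taken from the figures: literals are encoded as hypothetical pairs
(variable, polarity), a history only records the literal states visited, and in $N'(c_i)$ the play
is assumed to move to the literal whose index is the sum of the two moves modulo 3 (with literals
indexed from 0 to 2), which is one way of making the definition of $\sigma'_{A_2}$ and
$\sigma'_{A_3}$ work.

```python
def valuation(history, variables):
    """v^h: a variable is true iff its last occurrence in h is the positive literal."""
    val = {x: False for x in variables}   # never visited: false, as in the definition
    for var, positive in history:
        val[var] = positive
    return val


def falsified_clause(clauses, val):
    """Least index of a clause that is false under val (None if there is none)."""
    for index, clause in enumerate(clauses):
        if not any(val[var] == positive for var, positive in clause):
            return index
    return None


def counter_move(j, other_move):
    """Move of the deviating player so that the play reaches literal j (assumption:
    the module selects literal (a2 + a3) mod 3)."""
    return (j - other_move) % 3


# Unsatisfiable toy formula: (x or x or x) and (not x or not x or not x).
clauses = [[("x", True)] * 3, [("x", False)] * 3]
history = []
for _ in range(4):
    i = falsified_clause(clauses, valuation(history, ["x"]))
    history.append(clauses[i][0])         # the opponents reach some literal of c_i
```

On the unsatisfiable toy formula, the strategy of $A_1$ forces the play to alternate between $x$
and $\neg x$, as in the first half of the proof.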













We now proceed by encoding an instance $\phi^1, \ldots, \phi^m$ of $\oplus$SAT (each $\phi^r$ being a
formula over its own variables $x^r_1, x^r_2, \ldots$) into a parity game. The game involves the
three players $A_1$, $A_2$ and $A_3$ of the game $N(\phi)$ defined above, and it will contain a copy
of $N(\phi^r)$ for each $1 \le r \le m$. The objectives of $A_2$ and $A_3$ are the unions of their
objectives in each $N(\phi^r)$, e.g. $p^{A_2}(x^1_j) = \cdots = p^{A_2}(x^m_j) = 2j$.

For each such $r$, the game will also contain a copy of the game $M(\phi^r)$ depicted on
Figure 13. Each game $M(\phi^r)$ involves an extra set of players $B^r_k$, one for each variable
$x^r_k$. As we have seen in Section 5.4, in a Nash equilibrium, it cannot be the case that both
$x^r_k$ and $\neg x^r_k$ are visited infinitely often.

In order to test the parity of the number of satisfiable formulas, we then define two
families of modules, depicted on Figures 16 to 19. Finally, the whole game $\mathcal{G}$ is depicted
on Figure 20. In that game, the objective of $A_1$ is to visit infinitely often the initial state
init.



Figure 16. Module $H(\phi^r)$ for $r \ge 2$. Figure 17. Module $G(\phi^r)$ for $r \ge 2$.

Figure 18. Module $H(\phi^1)$. Figure 19. Module $G(\phi^1)$. Figure 20. The game $\mathcal{G}$.

Lemma 5.20. There is a Nash equilibrium in the game $\mathcal{G}$ where $A_2$ and $A_3$ lose and $A_1$
wins if, and only if, the number of satisfiable formulas is even.

Proof. Assume that there is a Nash equilibrium in $\mathcal{G}$ where $A_1$ wins and both $A_2$ and
$A_3$ lose. Let $\rho$ be its outcome. As already noted, if $\rho$ visits module $M(\phi^r)$
infinitely often, then it cannot be the case that both $x^r_k$ and $\neg x^r_k$ are visited
infinitely often in $M(\phi^r)$, as otherwise $B^r_k$ would be losing and have the opportunity to
improve her payoff. This implies that $\phi^r$ is satisfiable. Similarly, if $\rho$ visits infinitely
often the state of $H(\phi^r)$ or $G(\phi^r)$ that is controlled by $A_2$ and $A_3$, then it must be
the case that $\phi^r$ is not satisfiable, since otherwise, from Lemma 5.19, $A_2$ or $A_3$ could
deviate and improve her payoff by going to $N(\phi^r)$.

We now show by induction on $r$ that if $\rho$ goes infinitely often in module $G(\phi^r)$ then
$\#\{j \le r \mid \phi^j \text{ is satisfiable}\}$ is even, and that (if $n > 1$) this number is odd
if $\rho$ goes infinitely often in module $H(\phi^r)$.

When $r = 1$, since $H(\phi^1)$ is $M(\phi^1)$, $\phi^1$ is satisfiable, as noted above. Similarly,
if $\rho$ visits $G(\phi^1)$ infinitely often, it also visits its $A_2/A_3$-state infinitely often,
so that $\phi^1$ is not satisfiable. This proves the base case.

Assume that the result holds up to some $r - 1$, and assume that $\rho$ visits $G(\phi^r)$
infinitely often. Two cases may occur:

• it can be the case that $M(\phi^r)$ is visited infinitely often, as well as $H(\phi^{r-1})$.
Then $\phi^r$ is satisfiable, and the number of satisfiable formulas with index less than or equal
to $r - 1$ is odd. Hence the number of satisfiable formulas with index less than or equal to $r$ is
even.

• it can also be the case that the state $A_2/A_3$ of $G(\phi^r)$ is visited infinitely often.
Then $\phi^r$ is not satisfiable. Moreover, since $A_1$ wins, the play will also visit
$G(\phi^{r-1})$ infinitely often, so that the number of satisfiable formulas with index less than
or equal to $r$ is even.

If $\rho$ visits $H(\phi^r)$ infinitely often, using similar arguments we prove that the number of
satisfiable formulas with index less than or equal to $r$ is odd.

To conclude, since $A_1$ wins, the play visits $G(\phi^m)$ infinitely often, so that the total
number of satisfiable formulas is even.

Conversely, assume that the number of satisfiable formulas is even. We build a strategy
profile, which we prove is a Nash equilibrium in which $A_1$ wins, and $A_2$ and $A_3$ lose.
The strategy for $A_1$ in the initial states of $H(\varphi^r)$ and $G(\varphi^r)$ is to go to $M(\varphi^r)$ when $\varphi^r$
is satisfiable, and to state $A_2/A_3$ otherwise. In $M(\varphi^r)$, the strategy is to play according
to a valuation satisfying $\varphi^r$. In $N(\varphi^r)$, it follows a strategy along which $A_2$ and $A_3$ lose
(this exists according to Lemma 5.19). This defines the strategy for $A_1$. Then $A_2$ and $A_3$
are required to always play the same move, so that the play never goes to some $N(\varphi^r)$.
In $N(\varphi^r)$, they can play any strategy (they lose anyway, whatever they do). Finally, the
strategy of $B^r_k$ never goes to $\bot$.

We now explain why this is the Nash equilibrium we are after. First, as $A_1$ plays
according to fixed valuations for the variables $x^r_k$, either $B^r_k$ wins or she does not have
the opportunity to go to $\bot$. It remains to prove that $A_1$ wins, and that $A_2$ and $A_3$ lose
and cannot improve (individually). To see this, notice that between two consecutive visits
to init, exactly one of $G(\varphi^r)$ and $H(\varphi^r)$ is visited. More precisely, it can be observed that
the strategy of $A_1$ enforces that $G(\varphi^r)$ is visited if $\#\{r < r' \le m \mid \varphi^{r'} \text{ is satisfiable}\}$ is even,
and that $H(\varphi^r)$ is visited otherwise. Then if $H(\varphi^1)$ is visited, the number of satisfiable
formulas with index between 2 and $m$ is odd, so that $\varphi^1$ is satisfiable and $A_1$ can return
to init. If $G(\varphi^1)$ is visited, an even number of formulas with index between 2 and $m$ is
satisfiable, and $\varphi^1$ is not. Hence $A_1$ has a strategy in $N(\varphi^1)$ to make $A_2$ and $A_3$ lose, so
that $A_2$ and $A_3$ cannot improve their payoffs. □

This proves hardness for the constrained NE existence problem with parity objectives.
For the NE existence problem, we use the construction of Section 3.4, but since it can
only be used to get rid of constraints of the type “$A_i$ is winning”, we add to the game
two players, $A_4$ and $A_5$, whose objectives are opposite to those of $A_2$ and $A_3$ respectively, and one
player $A_6$ that will be playing matching-penny games. The objectives for $A_4$ and $A_5$ are
definable by parity objectives, by adding 1 to all the priorities. Then, we consider the game
$\mathcal{G}' = E(E(E(\mathcal{G}, A_1, A_6, \rho_1), A_4, A_6, \rho_4), A_5, A_6, \rho_5)$ where $\rho_1$, $\rho_4$ and $\rho_5$ are winning paths
for $A_1$, $A_4$ and $A_5$ respectively. Thanks to Proposition 3.5, there is a Nash equilibrium
in $\mathcal{G}'$ if, and only if, there is a Nash equilibrium in $\mathcal{G}$ where $A_1$ wins and $A_2$ and $A_3$ lose.
We deduce $\mathsf{P}^{\mathsf{NP}}_{\parallel}$-hardness for the NE existence problem with parity objectives.

5.7. Objectives given as deterministic Rabin automata. In order to find Nash equilibria
when objectives are given as deterministic Rabin automata, we first define the notion
of game simulation, which we show has the property that when $\mathcal{G}'$ game-simulates $\mathcal{G}$, then
a Nash equilibrium in the latter game gives rise to a Nash equilibrium in the former one.

We then define the product of a game with automata (defining the objectives of the
players), and show that it game-simulates the original game. This reduces the case of games
whose objectives are defined by Rabin automata to games with Rabin objectives, which we
handled in the previous section; the resulting algorithm is in EXPTIME. We then show a
PSPACE lower bound for the problem in the case of objectives given as deterministic Büchi
automata. This proves the following theorem:

Theorem 5.21. For finite games with single objectives given as deterministic Rabin automata
or deterministic Büchi automata, the NE existence problem and the constrained NE
existence problem are in EXPTIME and PSPACE-hard.

It must be noticed that game simulation can be used in other contexts: in particular,
in previous work (where we introduced this notion), it is shown that a region-based abstraction of timed
games game-simulates its original timed game, which provides a way of computing Nash
equilibria in timed games.

5.7.1. Game simulation. We define game simulation and show how that can be used to 
compute Nash equilibria. We then apply it to objectives given as deterministic Rabin 
automata. 

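Definition 5.22. Consider two games $\mathcal{G} = (\mathsf{States}, \mathsf{Agt}, \mathsf{Act}, \mathsf{Mov}, \mathsf{Tab}, (\precsim_A)_{A \in \mathsf{Agt}})$ and
$\mathcal{G}' = (\mathsf{States}', \mathsf{Agt}, \mathsf{Act}', \mathsf{Mov}', \mathsf{Tab}', (\precsim'_A)_{A \in \mathsf{Agt}})$ with the same set $\mathsf{Agt}$ of players. A relation
$\lhd \subseteq \mathsf{States} \times \mathsf{States}'$ is a game simulation if $s \lhd s'$ implies that for each move $m_{\mathsf{Agt}}$ in $\mathcal{G}$
there exists a move $m'_{\mathsf{Agt}}$ in $\mathcal{G}'$ such that:

(1) $\mathsf{Tab}(s, m_{\mathsf{Agt}}) \lhd \mathsf{Tab}'(s', m'_{\mathsf{Agt}})$, and

(2) for each $t' \in \mathsf{States}'$ there exists $t \in \mathsf{States}$ with $t \lhd t'$ and
$\mathsf{Susp}((s', t'), m'_{\mathsf{Agt}}) \subseteq \mathsf{Susp}((s, t), m_{\mathsf{Agt}})$.

If $\lhd$ is a game simulation and $(s_0, s'_0) \in \lhd$, we say that $\mathcal{G}'$ game-simulates (or simply
simulates) $\mathcal{G}$. When there are two paths $\rho$ and $\rho'$ such that $\rho_{=i} \lhd \rho'_{=i}$ for all $i \in \mathbb{N}$, we will
simply write $\rho \lhd \rho'$.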
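The two conditions of the definition can be checked mechanically on small finite games. The following is a minimal sketch, not the construction used in the paper: the dictionary encoding of games, the function names and the example game `G` are all assumptions made for illustration. It also assumes the suspect set recalled from the suspect-game section, namely that $\mathsf{Susp}((s,t), m_{\mathsf{Agt}})$ contains the players who can reach $t$ from $s$ by changing only their own action in $m_{\mathsf{Agt}}$.

```python
from itertools import product

def move_profiles(game, s):
    """All move profiles (one action per player) available in state s."""
    agents = game["agents"]
    options = [game["mov"][(s, a)] for a in agents]
    return [dict(zip(agents, acts)) for acts in product(*options)]

def tab(game, s, m):
    """Successor state of s under the move profile m."""
    return game["tab"][(s, tuple(m[a] for a in game["agents"]))]

def susp(game, s, t, m):
    """Players who can reach t from s by changing only their own action in m
    (assumed definition of the suspect set)."""
    return {a for a in game["agents"] for alt in game["mov"][(s, a)]
            if tab(game, s, {**m, a: alt}) == t}

def is_game_simulation(g, g2, rel):
    """Check conditions (1) and (2) for every related pair (s, s2)."""
    def witness(s, s2, m, m2):
        cond1 = (tab(g, s, m), tab(g2, s2, m2)) in rel
        cond2 = all(
            any((t, t2) in rel and susp(g2, s2, t2, m2) <= susp(g, s, t, m)
                for t in g["states"])
            for t2 in g2["states"])
        return cond1 and cond2
    return all(any(witness(s, s2, m, m2) for m2 in move_profiles(g2, s2))
               for (s, s2) in rel for m in move_profiles(g, s))

# Hypothetical two-player game: go to "a" when both actions agree, else to "b".
G = {
    "states": ["a", "b"],
    "agents": ["A1", "A2"],
    "mov": {(s, p): [0, 1] for s in "ab" for p in ("A1", "A2")},
    "tab": {(s, (x, y)): ("a" if x == y else "b")
            for s in "ab" for x in (0, 1) for y in (0, 1)},
}
identity = {("a", "a"), ("b", "b")}
```

On this example the identity relation passes the check, whereas the relation containing only the pair ("a", "b") fails condition (2), since no state of the first game is related to "a".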

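A game simulation $\lhd$ is preference-preserving from $(s_0, s'_0) \in \mathsf{States} \times \mathsf{States}'$ if for all
$\rho_1, \rho_2 \in \mathsf{Play}_{\mathcal{G}}(s_0)$ and $\rho'_1, \rho'_2 \in \mathsf{Play}_{\mathcal{G}'}(s'_0)$ with $\rho_1 \lhd \rho'_1$ and $\rho_2 \lhd \rho'_2$, for all $A \in \mathsf{Agt}$ it holds
that $\rho_1 \precsim_A \rho_2$ iff $\rho'_1 \precsim'_A \rho'_2$.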

As we show now, Nash equilibria are preserved by game simulation, in the following 
sense: 

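Proposition 5.23. Let $\mathcal{G} = (\mathsf{States}, \mathsf{Agt}, \mathsf{Act}, \mathsf{Mov}, \mathsf{Tab}, (\precsim_A)_{A \in \mathsf{Agt}})$ and $\mathcal{G}' = (\mathsf{States}', \mathsf{Agt},
\mathsf{Act}', \mathsf{Mov}', \mathsf{Tab}', (\precsim'_A)_{A \in \mathsf{Agt}})$ be two games involving the same set of players. Fix two states
$s_0$ and $s'_0$ in $\mathcal{G}$ and $\mathcal{G}'$ respectively, and let $\lhd$ be a preference-preserving game simulation
from $(s_0, s'_0)$. If there exists a Nash equilibrium $\sigma_{\mathsf{Agt}}$ in $\mathcal{G}$ from $s_0$, then there exists a Nash
equilibrium $\sigma'_{\mathsf{Agt}}$ in $\mathcal{G}'$ from $s'_0$ with $\mathsf{Out}_{\mathcal{G}}(s_0, \sigma_{\mathsf{Agt}}) \lhd \mathsf{Out}_{\mathcal{G}'}(s'_0, \sigma'_{\mathsf{Agt}})$.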

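Proof. We fix a strategy profile $\sigma_{\mathsf{Agt}}$ in $\mathcal{G}$ and $\rho$ the outcome of $\sigma_{\mathsf{Agt}}$ from $s_0$. We derive a
strategy profile $\sigma'_{\mathsf{Agt}}$ in $\mathcal{G}'$ and its outcome $\rho'$ from $s'_0$, such that:

(a) for every $\bar\rho' \in \mathsf{Play}_{\mathcal{G}'}(s'_0)$, there exists $\bar\rho \in \mathsf{Play}_{\mathcal{G}}(s_0)$ s.t. $\bar\rho \lhd \bar\rho'$ and $\mathsf{Susp}(\bar\rho', \sigma'_{\mathsf{Agt}}) \subseteq
\mathsf{Susp}(\bar\rho, \sigma_{\mathsf{Agt}})$;

(b) $\rho \lhd \rho'$.

Assume we have done the construction, and that $\sigma_{\mathsf{Agt}}$ is a Nash equilibrium in $\mathcal{G}$.
We prove that $\sigma'_{\mathsf{Agt}}$ is a Nash equilibrium in $\mathcal{G}'$. Towards a contradiction, assume that some
player $A$ has a strategy $\bar\sigma'_A$ in $\mathcal{G}'$ such that $\bar\rho' \not\precsim'_A \rho'$, where $\bar\rho' = \mathsf{Out}_{\mathcal{G}'}(s'_0, \sigma'_{\mathsf{Agt}}[A \mapsto \bar\sigma'_A])$.
Note that $A \in \mathsf{Susp}(\bar\rho', \sigma'_{\mathsf{Agt}})$. Applying (a) above, there exists $\bar\rho \in \mathsf{Play}_{\mathcal{G}}(s_0)$ such that
$\bar\rho \lhd \bar\rho'$ and $\mathsf{Susp}(\bar\rho', \sigma'_{\mathsf{Agt}}) \subseteq \mathsf{Susp}(\bar\rho, \sigma_{\mathsf{Agt}})$. In particular, $A \in \mathsf{Susp}(\bar\rho, \sigma_{\mathsf{Agt}})$, and there exists
a strategy $\bar\sigma_A$ for $A$ such that $\bar\rho = \mathsf{Out}_{\mathcal{G}}(s_0, \sigma_{\mathsf{Agt}}[A \mapsto \bar\sigma_A])$. As $\rho \lhd \rho'$ (by (b)) and $\lhd$ is
preference-preserving from $(s_0, s'_0)$, we get $\bar\rho \not\precsim_A \rho$, which contradicts the fact that $\sigma_{\mathsf{Agt}}$ is a Nash
equilibrium. Hence, $\sigma'_{\mathsf{Agt}}$ is a Nash equilibrium in $\mathcal{G}'$ from $s'_0$.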

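It remains to show how we construct $\sigma'_{\mathsf{Agt}}$ (and $\rho'$). We first build $\rho'$ inductively, and
define $\sigma'_{\mathsf{Agt}}$ along that path.

• Initially, we let $\rho'_{=0} = s'_0$. Since $\lhd$ is a game simulation containing $(s_0, s'_0)$, we have
$s_0 \lhd s'_0$, and there is a move $m'_{\mathsf{Agt}}$ associated with $\sigma_{\mathsf{Agt}}(s_0)$ satisfying the conditions of
Definition 5.22. Then $\rho_{=0} \lhd \rho'_{=0}$, and $\mathsf{Susp}(\rho'_{=0}, \sigma'_{\mathsf{Agt}}(\rho'_{=0})) \subseteq \mathsf{Susp}(\rho_{=0}, \sigma_{\mathsf{Agt}}(\rho_{=0}))$.

• Assume we have built $\rho'_{\le i}$ and $\sigma'_{\mathsf{Agt}}$ on all the prefixes of $\rho'_{<i}$, and that they are such that
$\rho_{\le i} \lhd \rho'_{\le i}$ and $\mathsf{Susp}(\rho'_{\le i}, \sigma'_{\mathsf{Agt}}) \subseteq \mathsf{Susp}(\rho_{\le i}, \sigma_{\mathsf{Agt}})$ (notice that $\mathsf{Susp}(\rho'_{\le i}, \sigma'_{\mathsf{Agt}})$ only depends
on the value of $\sigma'_{\mathsf{Agt}}$ on all the prefixes of $\rho'_{<i}$). In particular, we have $\rho_{=i} \lhd \rho'_{=i}$, so that
with the move $\sigma_{\mathsf{Agt}}(\rho_{\le i})$, we can associate a move $m'_{\mathsf{Agt}}$ (to which we set $\sigma'_{\mathsf{Agt}}(\rho'_{\le i})$)
satisfying both conditions of Definition 5.22. This defines $\rho'_{\le i+1}$ in such a way that
$\rho_{\le i+1} \lhd \rho'_{\le i+1}$; moreover, $\mathsf{Susp}(\rho'_{\le i+1}, \sigma'_{\mathsf{Agt}}) = \mathsf{Susp}(\rho'_{\le i}, \sigma'_{\mathsf{Agt}}) \cap \mathsf{Susp}((\rho'_{=i}, \rho'_{=i+1}), m'_{\mathsf{Agt}})$
is indeed a subset of $\mathsf{Susp}(\rho_{\le i+1}, \sigma_{\mathsf{Agt}})$.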

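It remains to define $\sigma'_{\mathsf{Agt}}$ outside its outcome $\rho'$. Notice that, for our purposes, it suffices to
define $\sigma'_{\mathsf{Agt}}$ on histories starting from $s'_0$. We again proceed by induction on the length of the
histories, defining $\sigma'_{\mathsf{Agt}}$ in order to satisfy (a) on prefixes of plays of $\mathcal{G}'$ from $s'_0$. At each
step, we also make sure that for every $h' \in \mathsf{Hist}_{\mathcal{G}'}(s'_0)$, there exists $h \in \mathsf{Hist}_{\mathcal{G}}(s_0)$ such that
$h \lhd h'$, $\mathsf{Susp}(h', \sigma'_{\mathsf{Agt}}) \subseteq \mathsf{Susp}(h, \sigma_{\mathsf{Agt}})$, and $\sigma_{\mathsf{Agt}}(h)$ and $\sigma'_{\mathsf{Agt}}(h')$ satisfy the conditions of
Definition 5.22 in the last states of $h$ and $h'$, resp.

As we only consider histories from $s'_0$, the case of histories of length zero was already
handled. Assume we have defined $\sigma'_{\mathsf{Agt}}$ for histories $h'$ of length $i$, and fix a new history
$h' \cdot t' \in \mathsf{Hist}_{\mathcal{G}'}(s'_0)$ of length $i + 1$ (that is not a prefix of $\rho'$). By induction hypothesis,
there is $h \in \mathsf{Hist}_{\mathcal{G}}(s_0)$ such that $h \lhd h'$, and $\mathsf{Susp}(h', \sigma'_{\mathsf{Agt}}) \subseteq \mathsf{Susp}(h, \sigma_{\mathsf{Agt}})$, and $\sigma_{\mathsf{Agt}}(h)$
and $\sigma'_{\mathsf{Agt}}(h')$ satisfy the required properties. In particular, with $t'$, we can associate $t$ s.t.
$t \lhd t'$ and $\mathsf{Susp}((\mathsf{last}(h'), t'), \sigma'_{\mathsf{Agt}}(h')) \subseteq \mathsf{Susp}((\mathsf{last}(h), t), \sigma_{\mathsf{Agt}}(h))$. Then $(h \cdot t) \lhd (h' \cdot t')$.
Since $t \lhd t'$, there is a move $m'_{\mathsf{Agt}}$ associated with $\sigma_{\mathsf{Agt}}(h \cdot t)$ and satisfying the conditions of
Definition 5.22. Letting $\sigma'_{\mathsf{Agt}}(h' \cdot t') = m'_{\mathsf{Agt}}$, we fulfill all the requirements of our induction
hypothesis.

We now need to lift the property from histories to infinite paths. Consider a play
$\bar\rho' \in \mathsf{Play}_{\mathcal{G}'}(s'_0)$; we will construct a corresponding play $\bar\rho$ in $\mathcal{G}$. Set $\bar\rho_0 = s_0$. If $\bar\rho$ has
been defined up to index $i$ and $\bar\rho_i \lhd \bar\rho'_i$ (this is true for $i = 0$), thanks to the way $\sigma'_{\mathsf{Agt}}$ is
constructed, $\sigma_{\mathsf{Agt}}(\bar\rho_{\le i})$ and $\sigma'_{\mathsf{Agt}}(\bar\rho'_{\le i})$ satisfy the conditions of Definition 5.22 in $\bar\rho_i$ and
$\bar\rho'_i$, respectively. We then pick $\bar\rho_{i+1}$ such that $\bar\rho_{i+1} \lhd \bar\rho'_{i+1}$ and, as granted by condition (2),
$\mathsf{Susp}((\bar\rho'_i, \bar\rho'_{i+1}), \sigma'_{\mathsf{Agt}}(\bar\rho'_{\le i})) \subseteq \mathsf{Susp}((\bar\rho_i, \bar\rho_{i+1}), \sigma_{\mathsf{Agt}}(\bar\rho_{\le i}))$. This being true at each step, the path $\bar\rho$ that is obtained is such
that $\bar\rho \lhd \bar\rho'$ and $\mathsf{Susp}(\bar\rho', \sigma'_{\mathsf{Agt}}) \subseteq \mathsf{Susp}(\bar\rho, \sigma_{\mathsf{Agt}})$. This is the desired property. □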

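5.7.2. Product of a game with deterministic Rabin automata. After this digression on game
simulation, we come back to the game $\mathcal{G} = (\mathsf{States}, \mathsf{Agt}, \mathsf{Act}, \mathsf{Mov}, \mathsf{Tab}, (\precsim_A)_{A \in \mathsf{Agt}})$, where
we assume that some player $A$ has her objective given by a deterministic Rabin automaton
$\mathcal{A} = (Q, \mathsf{States}, \delta, q_0, (Q_i, R_i)_{i \in [1,n]})$ (recall that this automaton reads sequences of states
of $\mathcal{G}$, and accepts the paths that are winning for player $A$). We show how to compute Nash
equilibria in $\mathcal{G}$ by building a product $\mathcal{G}'$ of $\mathcal{G}$ with the automaton $\mathcal{A}$ and by computing the
Nash equilibria in the resulting game, with a Rabin winning condition for $A$.

We define the product of the game $\mathcal{G}$ with the automaton $\mathcal{A}$ as the game $\mathcal{G} \ltimes \mathcal{A} =
(\mathsf{States}', \mathsf{Agt}, \mathsf{Act}, \mathsf{Mov}', \mathsf{Tab}', (\precsim'_B)_{B \in \mathsf{Agt}})$ where:

• $\mathsf{States}' = \mathsf{States} \times Q$;

• $\mathsf{Mov}'((s, q), A_j) = \mathsf{Mov}(s, A_j)$ for every $A_j \in \mathsf{Agt}$;

• $\mathsf{Tab}'((s, q), m_{\mathsf{Agt}}) = (s', q')$ where $\mathsf{Tab}(s, m_{\mathsf{Agt}}) = s'$ and $\delta(q, s) = q'$;

• If $B = A$ then $\precsim'_B$ is given by the internal Rabin condition $Q'_i = \mathsf{States} \times Q_i$ and
$R'_i = \mathsf{States} \times R_i$. Otherwise $\precsim'_B$ is derived from $\precsim_B$, defined by $\rho \precsim'_B \bar\rho$ if, and only if,
$\mathsf{proj}(\rho) \precsim_B \mathsf{proj}(\bar\rho)$ (where $\mathsf{proj}$ is the projection of $\mathsf{States}'$ on $\mathsf{States}$). Notice that if
$\precsim_B$ is an internal Rabin condition, then so is $\precsim'_B$.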
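The arena part of the product (states, moves and transition table) can be sketched as follows; the preference relations are left out. This is only an illustration under assumptions: the dictionary encoding, the function name `product_game`, the example game `G` and the two-state automaton `AUT` are hypothetical. Note that the automaton component is updated by reading the current game state $s$, as in the definition of $\mathsf{Tab}'$.

```python
from itertools import product as cartesian

def product_game(game, aut):
    """Arena of the product: States' = States x Q, same moves, synchronised Tab'."""
    states = [(s, q) for s in game["states"] for q in aut["states"]]
    mov = {((s, q), a): game["mov"][(s, a)]
           for (s, q) in states for a in game["agents"]}
    tab = {}
    for (s, q) in states:
        options = [game["mov"][(s, a)] for a in game["agents"]]
        for m in cartesian(*options):
            # The automaton reads the current state s of the game.
            tab[((s, q), m)] = (game["tab"][(s, m)], aut["delta"][(q, s)])
    return {"states": states, "agents": game["agents"], "mov": mov, "tab": tab}

# Hypothetical two-player game: go to "a" when both actions agree, else to "b".
G = {
    "states": ["a", "b"],
    "agents": ["A1", "A2"],
    "mov": {(s, p): [0, 1] for s in "ab" for p in ("A1", "A2")},
    "tab": {(s, (x, y)): ("a" if x == y else "b")
            for s in "ab" for x in (0, 1) for y in (0, 1)},
}
# Hypothetical automaton remembering whether the last state read was "a".
AUT = {
    "states": [0, 1],
    "init": 0,
    "delta": {(q, s): (1 if s == "a" else 0) for q in (0, 1) for s in "ab"},
}
P = product_game(G, AUT)
```

The product has $|\mathsf{States}| \cdot |Q|$ states, and projecting any transition of `P` on its first component gives back a transition of `G`.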

Lemma 5.24. $\mathcal{G} \ltimes \mathcal{A}$ game-simulates $\mathcal{G}$, with game simulation defined according to the
projection: $s \lhd (s', q)$ iff $s = s'$. This game simulation is preference-preserving.

Conversely, $\mathcal{G}$ game-simulates $\mathcal{G} \ltimes \mathcal{A}$, with game simulation defined by $(s, q) \lhd' s'$ iff
$s = s'$, which is also preference-preserving.

Proof. We begin with proving that both relations are preference-preserving. First notice
that if $((s_n, q_n))_{n \ge 0}$ is a play in $\mathcal{G} \ltimes \mathcal{A}$, then its $\mathsf{proj}$-projection $(s_n)_{n \ge 0}$ is a play in $\mathcal{G}$.
Conversely, if $\rho = (s_n)_{n \ge 0}$ is a play in $\mathcal{G}$, then there is a unique path $(q_n)_{n \ge 0}$ from initial
state $q_0$ in $\mathcal{A}$ which reads it, and $((s_n, q_n))_{n \ge 0}$ is then a path in $\mathcal{G} \ltimes \mathcal{A}$ that we write
$\mathsf{proj}^{-1}(\rho) = ((s_n, q_n))_{n \ge 0}$. That way, $\mathsf{proj}$ defines a one-to-one correspondence between
plays in $\mathcal{G}$ and plays in $\mathcal{G} \ltimes \mathcal{A}$ where the second component starts in $q_0$. For a player $B \ne A$,
the objective is defined so that $\mathsf{proj}(\rho)$ has the same payoff as $\rho$. Consider now player
$A$: she is winning in $\mathcal{G}$ for $\rho = (s_n)_{n \ge 0}$ iff $(s_n)_{n \ge 0} \in \mathcal{L}(\mathcal{A})$ iff the unique path $(q_n)_{n \ge 0}$
from initial state $q_0$ that reads $(s_n)_{n \ge 0}$ satisfies the Rabin condition $(Q_i, R_i)_{i \in [1,n]}$ in $\mathcal{A}$ iff
$\mathsf{proj}^{-1}(\rho)$ satisfies the internal Rabin condition $(Q'_i, R'_i)_{i \in [1,n]}$ in $\mathcal{G} \ltimes \mathcal{A}$. This proves that $\lhd$ and $\lhd'$
are preference-preserving.

It remains to show that both relations are game simulations. Assume $s \lhd (s, q)$ and pick
a move $m_{\mathsf{Agt}}$ in $\mathcal{G}$. It is also a move in $\mathcal{G} \ltimes \mathcal{A}$, and $\mathsf{Tab}'((s, q), m_{\mathsf{Agt}}) = (\mathsf{Tab}(s, m_{\mathsf{Agt}}), \delta(q, s))$.
By definition of $\lhd$ it then holds that $\mathsf{Tab}(s, m_{\mathsf{Agt}}) \lhd \mathsf{Tab}'((s, q), m_{\mathsf{Agt}})$, which proves condition (1)
of the definition of a game simulation. It remains to show condition (2). Pick a
state $(s', q') \in \mathsf{States}'$. We distinguish two cases:

• If $\delta(q, s) \ne q'$ then $\mathsf{Susp}(((s, q), (s', q')), m_{\mathsf{Agt}}) = \emptyset$, and condition (2) trivially holds.

• Otherwise $\delta(q, s) = q'$. In that case, for any move $m'_{\mathsf{Agt}}$ we have that $\mathsf{Tab}(s, m'_{\mathsf{Agt}}) = s'$
if, and only if, $\mathsf{Tab}'((s, q), m'_{\mathsf{Agt}}) = (s', q')$. It follows that $\mathsf{Susp}(((s, q), (s', q')), m_{\mathsf{Agt}}) =
\mathsf{Susp}((s, s'), m_{\mathsf{Agt}})$, which implies condition (2).

This proves that $\mathcal{G} \ltimes \mathcal{A}$ game-simulates $\mathcal{G}$.

We now assume $(s, q) \lhd' s$ and pick a move $m_{\mathsf{Agt}}$ in $\mathcal{G} \ltimes \mathcal{A}$. It is also a move in $\mathcal{G}$,
and as previously, condition (1) obviously holds. Pick now $s' \in \mathsf{States}$. We define $q' =
\delta(q, s)$, and we have $(s', q') \lhd' s'$ by definition of $\lhd'$. As before, we get condition (2) because
$\mathsf{Susp}(((s, q), (s', q')), m_{\mathsf{Agt}}) = \mathsf{Susp}((s, s'), m_{\mathsf{Agt}})$. □

We will solve the case where each player’s objective is given by a deterministic Rabin
automaton by applying the above result inductively. We will obtain a game where each
player has an internal Rabin winning condition. Applying Proposition 5.23 each time,
we get the following result:

Proposition 5.25. Let $\mathcal{G} = (\mathsf{States}, \mathsf{Agt}, \mathsf{Act}, \mathsf{Mov}, \mathsf{Tab}, (\precsim_A)_{A \in \mathsf{Agt}})$ be a finite concurrent
game, where for each player $A_i$, the preference relation $\precsim_{A_i}$ is single-objective given by a
deterministic Rabin automaton $\mathcal{A}_i$. Write $\mathsf{Agt} = \{A_1, \dots, A_n\}$. There is a Nash equilibrium
$\sigma_{\mathsf{Agt}}$ in $\mathcal{G}$ from some state $s$ with outcome $\rho$ iff there is a Nash equilibrium $\sigma'_{\mathsf{Agt}}$ in $\mathcal{G}' =
(((\mathcal{G} \ltimes \mathcal{A}_1) \ltimes \mathcal{A}_2) \cdots \ltimes \mathcal{A}_n)$ from $(s, q_{01}, \dots, q_{0n})$ with outcome $\rho'$, where $q_{0i}$ is the initial state
of $\mathcal{A}_i$ and $\rho$ is the projection of $\rho'$ on $\mathcal{G}$.


5.7.3. Algorithm. Assume that the objective of player $A_i$ is given by a deterministic Rabin
automaton $\mathcal{A}_i$. The algorithm for solving the constrained NE existence problem starts by
computing the product of the game with the automata: $\mathcal{G}' = (((\mathcal{G} \ltimes \mathcal{A}_1) \ltimes \mathcal{A}_2) \cdots \ltimes \mathcal{A}_n)$.
The resulting game has size $|\mathcal{G}| \times \prod_{j \in [1,n]} |\mathcal{A}_j|$, which is exponential in the number of
players. For each player $A_j$ ($1 \le j \le n$), the number of Rabin pairs in the product game
is that of the original specification $\mathcal{A}_j$, say $k_j$. We then apply the deterministic algorithm
that we have designed for Rabin objectives (see Subsection 5.6.3), which yields an
exponential-time algorithm in our framework.

5.7.4. Hardness. We prove PSPACE-hardness in the restricted case of deterministic Büchi
automata, by a reduction from (the complement of) the problem of the emptiness of the
intersection of several languages given by deterministic finite automata. This problem is
known to be PSPACE-complete [29, Lemma 3.2.3].

We fix finite automata $\mathcal{A}_1, \dots, \mathcal{A}_n$ over alphabet $\Sigma$. Let $\Sigma' = \Sigma \cup \{\mathsf{init}, \mathsf{final}\}$, where
init and final are two special symbols not in $\Sigma$. For every $j \in [1, n]$, we construct a Büchi
automaton $\mathcal{A}'_j$ from $\mathcal{A}_j$ as follows. We add a state $F$ with a self-loop labelled by final and
an initial state $I$ with a transition labelled by init to the original initial state. We add
transitions labelled by final from every terminal state to $F$. We set the Büchi condition
to $\{F\}$. If $\mathcal{L}_j$ is the language recognised by $\mathcal{A}_j$, then the language recognised by the Büchi
automaton $\mathcal{A}'_j$ is $\mathcal{L}'_j = \mathsf{init} \cdot \mathcal{L}_j \cdot \mathsf{final}^{\omega}$. The intersection of the languages recognised by the
automata $\mathcal{A}_j$ is empty if, and only if, the intersection of the languages recognised by the
automata $\mathcal{A}'_j$ is empty.
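The transformation from a finite automaton to the Büchi automaton recognising $\mathsf{init} \cdot \mathcal{L}_j \cdot \mathsf{final}^{\omega}$ can be sketched as below. The encoding (a dictionary with a partial transition function), the names `to_buchi` and `accepts_lasso`, and the example automaton are assumptions made for illustration; "terminal" states are read here as the accepting states of the finite automaton, and the state names "I" and "F" are assumed fresh.

```python
def to_buchi(dfa):
    """Add a fresh initial state I and a fresh Buchi state F as described above."""
    delta = dict(dfa["delta"])
    delta[("I", "init")] = dfa["init"]          # I --init--> old initial state
    for q in dfa["accepting"]:
        delta[(q, "final")] = "F"               # terminal state --final--> F
    delta[("F", "final")] = "F"                 # self-loop on F
    return {"init": "I", "delta": delta, "buchi": {"F"}}

def accepts_lasso(aut, prefix, cycle):
    """Does the deterministic Buchi automaton accept prefix . cycle^omega?"""
    q = aut["init"]
    for a in prefix:
        q = aut["delta"].get((q, a))
        if q is None:
            return False
    seen, hits = {}, []          # state at the start of each round, Buchi hit per round
    while q not in seen:
        seen[q] = len(hits)
        hit = False
        for a in cycle:
            q = aut["delta"].get((q, a))
            if q is None:
                return False
            hit = hit or q in aut["buchi"]
        hits.append(hit)
    # Rounds from seen[q] onwards repeat forever.
    return any(hits[seen[q]:])

# Hypothetical DFA over {a, b} accepting the words that end with "a".
ENDS_A = {
    "init": 0,
    "accepting": {1},
    "delta": {(0, "a"): 1, (0, "b"): 0, (1, "a"): 1, (1, "b"): 0},
}
B = to_buchi(ENDS_A)
```

For instance, `accepts_lasso(B, ["init", "b", "a"], ["final"])` holds because "ba" ends with "a", while the word built from "ab" is rejected since no transition labelled final leaves a non-terminal state.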

We construct the game $\mathcal{G}$, with $\mathsf{States} = \Sigma'$. For each $j \in [1, n]$, there is a player $A_j$
whose objective is given by $\mathcal{A}'_j$ and one special player $A_0$ whose objective is $\mathsf{States}^{\omega}$ (she is
always winning). Player $A_0$ controls all the states and there are transitions from any state
to the states of $\Sigma \cup \{\mathsf{final}\}$. Formally $\mathsf{Act} = \Sigma \cup \{\mathsf{final}\} \cup \{\bot\}$, for all states $s \in \mathsf{States}$,
$\mathsf{Mov}(s, A_0) = \mathsf{Act}$, and if $j \ne 0$ then $\mathsf{Mov}(s, A_j) = \{\bot\}$, and for all $a \in \Sigma \cup \{\mathsf{final}\}$,
$\mathsf{Tab}(s, (a, \bot, \dots, \bot)) = a$.

Lemma 5.26. There is a Nash equilibrium in game $\mathcal{G}$ from init where every player wins if,
and only if, the intersection of the languages recognised by the automata $\mathcal{A}_j$ is not empty.

Proof. If there is such a Nash equilibrium, let $\rho$ be its outcome. The path $\rho$ forms a word
over $\Sigma'$; it is accepted by every automaton $\mathcal{A}'_j$ since every player wins. Hence the intersection
of the languages $\mathcal{L}_j$ is not empty.

Conversely, if a word $w = \mathsf{init} \cdot w_1 \cdot w_2 \cdots$ is accepted by all the automata, player $A_0$
can play in a way such that everybody is winning: if at each step $j$ she plays $w_j$, then
the outcome is $w$ which is accepted by all the automata. It is a Nash equilibrium since $A_0$
controls everything and cannot improve her payoff. □
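For intuition about the problem the reduction starts from, non-emptiness of an intersection of finite automata can be decided by exploring the synchronised product of the automata. The sketch below does so with a breadth-first search; it takes time exponential in the number of automata and is only meant to illustrate the problem, not the PSPACE procedure. The encoding and the example automata are assumptions.

```python
from collections import deque

def intersection_nonempty(dfas, alphabet):
    """Search the synchronised product for a tuple of accepting states."""
    start = tuple(d["init"] for d in dfas)
    seen, queue = {start}, deque([start])
    while queue:
        qs = queue.popleft()
        if all(q in d["accepting"] for q, d in zip(qs, dfas)):
            return True
        for a in alphabet:
            nxt = tuple(d["delta"].get((q, a)) for q, d in zip(qs, dfas))
            # A missing transition (None) means the word is rejected by some automaton.
            if None not in nxt and nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return False

# Hypothetical automata over {a, b}.
ENDS_A = {"init": 0, "accepting": {1},
          "delta": {(0, "a"): 1, (0, "b"): 0, (1, "a"): 1, (1, "b"): 0}}
ENDS_B = {"init": 0, "accepting": {1},
          "delta": {(0, "a"): 0, (0, "b"): 1, (1, "a"): 0, (1, "b"): 1}}
HAS_B = {"init": 0, "accepting": {1},
         "delta": {(0, "a"): 0, (0, "b"): 1, (1, "a"): 1, (1, "b"): 1}}
```

Here the languages of `ENDS_A` and `HAS_B` share the word "ba", whereas no word ends with both "a" and "b".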

Since PSPACE is closed under complementation, this proves that the constrained NE existence
problem is PSPACE-hard for objectives described by Büchi automata.

In order to prove hardness for the NE existence problem we use results from Section 3.4.
Winning conditions in $E(E(\dots(E(\mathcal{G}, A_n, A_0, \rho_n), \dots, A_2, A_0, \rho_2), A_1, A_0, \rho_1)$, where $\rho_i$ is a
winning play for $A_i$, can be defined by slightly modifying automata $\mathcal{A}'_1, \dots, \mathcal{A}'_n$ to take into
account the new states. By Proposition 3.5, there exists a Nash equilibrium in this game if,
and only if, there is one in $\mathcal{G}$ where all the players win. Hence PSPACE-hardness also holds
for the NE existence problem.

6. Ordered Büchi objectives

In this section we assume that preference relations of the players are given by ordered Büchi
objectives (as defined earlier in the paper), and we prove the results listed in Table 2 (page 3).
We first consider the general case of preorders given as Boolean circuits, and then exhibit
several simpler cases.

For the rest of this section, we fix a game $\mathcal{G} = (\mathsf{States}, \mathsf{Agt}, \mathsf{Act}, \mathsf{Mov}, \mathsf{Tab}, (\precsim_A)_{A \in \mathsf{Agt}})$,
and assume that $\precsim_A$ is given by an ordered Büchi objective $\omega_A = ((\Omega^A_i)_{1 \le i \le n_A}, \lesssim_A)$.


6.1. General case: preorders are given as circuits. 

Theorem 6.1. For finite games with ordered Büchi objectives where preorders are given
as Boolean circuits, the value problem, the NE existence problem and the constrained NE
existence problem are PSPACE-complete.

Proof. We explain the algorithm for the constrained NE existence problem. We assume
that for each player $A$, the preorder $\lesssim_A$ is given by a Boolean circuit $C_A$. The algorithm
proceeds by trying all the possible payoffs for the players.

Fix such a payoff $(v^A)_{A \in \mathsf{Agt}}$, with $v^A \in \{0,1\}^{n_A}$ for every player $A$. We build a circuit
$D_A$ which represents a single objective for player $A$. Inputs to circuit $D_A$ will be states of
the game. This circuit is constructed from $C_A$ as follows: We set all input gates $w_1 \cdots w_n$ of
circuit $C_A$ to the value given by payoff $v^A$. The former input $v_i$ receives the disjunction of
all the states in $\Omega^A_i$. We negate the output. It is not hard to check that the new circuit $D_A$
is such that for every play $\rho$, $D_A[\mathsf{Inf}(\rho)]$ evaluates to true if, and only if, $\mathsf{payoff}_A(\rho) \not\lesssim_A v^A$,
i.e. if $\rho$ is an improvement for player $A$.

Circuit $D_A$ is now viewed as a single objective for player $A$; we write $\mathcal{G}'$ for the new
game. We look for Nash equilibria in this new game, with payoff 0 for each player. Indeed,
a Nash equilibrium $\sigma_{\mathsf{Agt}}$ in $\mathcal{G}$ with payoff $(v^A)_{A \in \mathsf{Agt}}$ is a Nash equilibrium in game $\mathcal{G}'$ with
payoff $(0, \dots, 0)$. Conversely a Nash equilibrium $\sigma_{\mathsf{Agt}}$ in game $\mathcal{G}'$ with payoff $(0, \dots, 0)$ is a
Nash equilibrium in $\mathcal{G}$ as soon as the payoff of its outcome (in $\mathcal{G}$) is $(v^A)_{A \in \mathsf{Agt}}$.

We use the algorithm described in Section 5.5 for computing Nash equilibria with
single objectives given as Boolean circuits, and we slightly modify it to take into account
the constraint that it has payoff $v^A$ for each player $A$. This can be done in polynomial
space: thanks to Proposition 3.1, it is sufficient to look for plays of the form $\pi \cdot \tau^{\omega}$ with
$|\pi| \le |\mathsf{States}|^2$ and $|\tau| \le |\mathsf{States}|^2$.

PSPACE-hardness was proven for single objectives given as a Boolean circuit (the circuit
evaluates by setting to true all states that are visited infinitely often, and to false all
other states) in Section 5.5. This kind of objective can therefore be seen as an ordered
Büchi objective with a preorder given as a Boolean circuit. □
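The construction of the circuit $D_A$ used in this proof can be illustrated by modelling circuits as Boolean-valued functions. In the sketch below, `C` stands for the circuit of the preorder (it returns true when its first argument is below its second), `make_D` fixes the second argument to the payoff $v^A$, feeds the first one from the set of states visited infinitely often, and negates the output. All names, and the choice of the subset preorder for `C`, are assumptions made for the example.

```python
def make_D(C, v_A, targets):
    """Build D_A from a preorder test C(v, w), a fixed payoff v_A and target sets."""
    def D(inf_states):
        # Input i of the preorder circuit: is some state of the i-th target set
        # visited infinitely often?
        payoff = tuple(int(bool(inf_states & t)) for t in targets)
        # Second argument fixed to v_A, output negated.
        return not C(payoff, v_A)
    return D

def subset_leq(v, w):
    """Hypothetical preorder circuit: v is below w for the subset preorder."""
    return all(a <= b for a, b in zip(v, w))

TARGETS = [{"s1"}, {"s2"}]
D = make_D(subset_leq, (1, 0), TARGETS)
```

With payoff $(1, 0)$ fixed, `D` is false on plays that only see `s1` infinitely often and true on plays that see `s2` infinitely often, since their payoff is then not below $(1, 0)$.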

6.2. When the ordered objective can be (co-)reduced to a single Büchi objective.

For some ordered objectives, the preference relation can (efficiently) be reduced to a single
objective. For instance, a disjunction of several Büchi objectives can obviously be reduced
to a single Büchi objective, by considering the union of the target sets. Formally, we say
that an ordered Büchi objective $\omega = ((\Omega_i)_{1 \le i \le n}, \lesssim)$ is reducible to a single Büchi objective if,
given any payoff vector $v$, we can construct in polynomial time a target set $T(v)$ such that for
all paths $\rho$, $v \lesssim \mathsf{payoff}_{\omega}(\rho)$ if, and only if, $\mathsf{Inf}(\rho) \cap T(v) \ne \emptyset$. It means that securing payoff $v$
corresponds to ensuring infinitely many visits to the new target set. Similarly, we say that $\omega$
is co-reducible to a single Büchi objective if for any vector $v$ we can construct in polynomial
time a target set $T(v)$ such that $\mathsf{payoff}_{\omega}(\rho) \not\lesssim v$ if, and only if, $\mathsf{Inf}(\rho) \cap T(v) \ne \emptyset$. It means
that improving on payoff $v$ corresponds to ensuring infinitely many visits to the new target set.
We prove the following proposition, which exploits (co-)reducibility for efficiently solving
the various problems.

Proposition 6.2.

• For finite games with ordered Büchi objectives which are reducible to single Büchi objectives,
and in which the preorders are non-trivial and monotonic, the value problem is
P-complete.

• For finite games with ordered Büchi objectives which are co-reducible to single Büchi
objectives, and in which the preorders are non-trivial and monotonic, the NE existence
problem and the constrained NE existence problem are P-complete.

Note that the hardness results follow from the hardness of the same problems for single
Büchi objectives (see Section 5.3). We now prove the two upper bounds.

6.2.1. Reducibility to single Büchi objectives and the value problem. We transform the ordered
Büchi objectives of the considered player into a single Büchi objective, and use a
polynomial-time algorithm from the literature (Chapter 2 of the cited reference) to solve the resulting zero-sum (turn-based)
Büchi game.


Footnote (on “non-trivial”): that is, there is more than one class in the preorder.

6.2.2. Co-reducibility to single Büchi objectives and the (constrained) NE existence problem.
We assume that the ordered objectives $(\omega_A)_{A \in \mathsf{Agt}}$ are all co-reducible to single Büchi objectives.
We show that we can use the algorithm presented in Section 5.3.2 to solve the
constrained NE existence problem in polynomial time.

We first notice that the preference relations satisfy the hypotheses $(\star)$ (see page 30):
$(\star)_a$ and $(\star)_b$ are obvious, and $(\star)_c$ is by co-reducibility of the ordered objectives. It means
that we can apply the results of Lemmas 5.10 and 5.11 to the current framework. To be
able to conclude and apply Lemma 5.12, we need to show that for every payoff $v$, we can
compute in polynomial time the set $W(\mathcal{G}, v)$ in the suspect game $\mathcal{H}(\mathcal{G}, v)$.

Lemma 6.3. Fix a threshold $v$. The set $W(\mathcal{G}, v)$ can be computed in polynomial time.

Proof. As the ordered objectives are co-reducible to single Büchi objectives, we can construct
in polynomial time target sets $T^A(v)$ for each player $A$. The objective of Eve in the suspect
game $\mathcal{H}(\mathcal{G}, v)$ is then equivalent to a co-Büchi objective with target set $\{(T^A(v), P) \mid A \in P\}$.
The winning region $W(\mathcal{G}, v)$ can then be determined using a polynomial-time algorithm
of [23, Sect. 2.5.3]. □

6.2.3. Applications. We will give preorders to which the above applies, allowing us to infer
several P-completeness results in Table 2 (those written with reference “Section 6.2”).

We first show that reducibility and co-reducibility coincide when the preorder is total.

Lemma 6.4. Let $\omega = ((\Omega_i)_{1 \le i \le n}, \lesssim)$ be an ordered Büchi objective, and assume that $\lesssim$ is
total. Then, $\omega$ is reducible to a single Büchi objective if, and only if, $\omega$ is co-reducible to a
single Büchi objective.

Proof. Let $u \in \{0,1\}^n$ be a vector. If $u$ is a maximal element, the new target set is empty,
and thus satisfies the property for co-reducibility. Otherwise we pick a vector $v$ among the
smallest elements that are strictly larger than $u$. Since the preorder is reducible to a single
Büchi objective, there is a target set $T$ that is reached infinitely often exactly when the payoff
is greater than or equal to $v$. Since the preorder is total and by choice of $v$, we have $w \not\lesssim u \Leftrightarrow v \lesssim w$.
Thus the target set $T$ is visited infinitely often exactly when the payoff $w$ satisfies $w \not\lesssim u$. Hence
$\omega$ is co-reducible to a single Büchi objective.

The proof of the other direction is similar. □

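Lemma 6.5. Ordered Büchi objectives with disjunction or maximise preorders are reducible
to single Büchi objectives. Ordered Büchi objectives with disjunction, maximise or subset
preorders are co-reducible to single Büchi objectives.

Proof. Let $\omega = ((\Omega_i)_{1 \le i \le n}, \lesssim)$ be an ordered Büchi objective. Assume $T_i$ is the target set
for $\Omega_i$.

Assume $\lesssim$ is the disjunction preorder. If the payoff $v$ is different from $0$ then we
define $T(v)$ as the union of all the target sets: $T(v) = \bigcup_{i=1}^{n} T_i$. Then, for every run $\rho$,

$v \lesssim \mathsf{payoff}_{\omega}(\rho) \;\Leftrightarrow\; \text{there is some } i \text{ for which } \mathsf{Inf}(\rho) \cap T_i \ne \emptyset$

$\phantom{v \lesssim \mathsf{payoff}_{\omega}(\rho)} \;\Leftrightarrow\; \mathsf{Inf}(\rho) \cap T(v) \ne \emptyset$

If the payoff $v$ is $0$ then we get the expected result with $T(v) = \mathsf{States}$. Disjunction being
a total preorder, it is also co-reducible (from Lemma 6.4).

We assume now that $\lesssim$ is the maximise preorder. Given a payoff $v$, consider the index
$i_0 = \max\{i \mid v_i = 1\}$. We then define $T(v)$ as the union of the target sets that are above $i_0$:
$T(v) = \bigcup_{i \ge i_0} T_i$. The following four statements are then equivalent, if $\rho$ is a run:

$v \lesssim \mathsf{payoff}_{\omega}(\rho) \;\Leftrightarrow\; v \lesssim \mathbb{1}_{\{i \mid \mathsf{Inf}(\rho) \cap T_i \ne \emptyset\}}$

$\phantom{v \lesssim \mathsf{payoff}_{\omega}(\rho)} \;\Leftrightarrow\; i_0 \le \max\{i \mid \mathsf{Inf}(\rho) \cap T_i \ne \emptyset\}$

$\phantom{v \lesssim \mathsf{payoff}_{\omega}(\rho)} \;\Leftrightarrow\; \exists i \ge i_0.\ \mathsf{Inf}(\rho) \cap T_i \ne \emptyset$

Hence $\omega$ is reducible, and also co-reducible as it is total, to a single Büchi objective.

Finally, we assume that $\lesssim$ is the subset preorder, and we show that $\omega$ is then co-reducible
to a single Büchi objective. Given a payoff $v$, the new target is the union of the
target sets that are not reached infinitely often for that payoff: $T(v) = \bigcup_{\{i \mid v_i = 0\}} T_i$. Then
the following statements are equivalent, if $\rho$ is a run:

$\mathsf{payoff}_{\omega}(\rho) \not\lesssim v \;\Leftrightarrow\; \mathbb{1}_{\{i \mid \mathsf{Inf}(\rho) \cap T_i \ne \emptyset\}} \not\lesssim v$

$\phantom{\mathsf{payoff}_{\omega}(\rho) \not\lesssim v} \;\Leftrightarrow\; \exists i.\ \mathsf{Inf}(\rho) \cap T_i \ne \emptyset \text{ and } v_i = 0$

$\phantom{\mathsf{payoff}_{\omega}(\rho) \not\lesssim v} \;\Leftrightarrow\; \mathsf{Inf}(\rho) \cap T(v) \ne \emptyset$ □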
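The three target-set constructions of this proof can be checked by brute force on small instances. The sketch below is an illustration under assumptions: payoff vectors are 0-indexed tuples, the three preorders are spelled out as Python predicates following their usual reading (disjunction, index of the highest satisfied objective, inclusion), and the case $v = 0$ for the maximise preorder is handled with $T(v) = \mathsf{States}$, which is a choice made here and not stated in the text.

```python
from itertools import product

def payoff(inf, targets):
    return tuple(int(bool(inf & t)) for t in targets)

def union(sets):
    return set().union(*sets)

# Preorders on payoff vectors, written out as assumptions.
def leq_disjunction(v, w):
    return any(w) or not any(v)

def leq_maximise(v, w):
    top = lambda x: max((i for i, b in enumerate(x) if b), default=-1)
    return top(v) <= top(w)

def leq_subset(v, w):
    return all(a <= b for a, b in zip(v, w))

# Target sets T(v) from the proof.
def t_disjunction(v, targets, states):
    return union(targets) if any(v) else set(states)

def t_maximise(v, targets, states):
    ones = [i for i, b in enumerate(v) if b]
    return union(targets[max(ones):]) if ones else set(states)  # v = 0: assumed

def t_subset_co(v, targets, states):
    return union(t for t, b in zip(targets, v) if not b)

def holds(leq, target, targets, states, co):
    """Check the defining equivalence for all v and all non-empty Inf sets."""
    for v in product((0, 1), repeat=len(targets)):
        for bits in product((0, 1), repeat=len(states)):
            inf = {s for s, b in zip(states, bits) if b}
            if not inf:
                continue
            w = payoff(inf, targets)
            lhs = (not leq(w, v)) if co else leq(v, w)
            if lhs != bool(inf & target(v, targets, states)):
                return False
    return True

STATES = [0, 1, 2, 3]
TARGETS = [{0}, {1, 2}, {2, 3}]
```

Calling `holds` with `co=False` tests reducibility and with `co=True` tests co-reducibility; on this instance the subset construction passes the co-reducibility test but fails the reducibility one, in line with Remark 6.7.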

As a corollary, we get the following result: 

Corollary 6.6. For finite games with ordered Büchi objectives, with either the disjunction
or the maximise preorder, the value problem is P-complete. For finite games with ordered
Büchi objectives, with either the disjunction, the maximise or the subset preorder, the NE
existence problem and the constrained NE existence problem are P-complete.

Remark 6.7. Note that we cannot infer P-completeness of the value problem for the subset
preorder since the subset preorder is not total, and ordered objectives with subset preorder
are not reducible to single Büchi objectives. Such an ordered objective is actually reducible
to a generalised Büchi objective (several Büchi objectives should be satisfied).


6.3. When the ordered objective can be reduced to a deterministic Büchi automaton
objective. For some ordered objectives, the preference relation can (efficiently)
be reduced to the acceptance by a deterministic Büchi automaton. Formally, we say that
an ordered objective $\omega = ((\Omega_i)_{1 \le i \le n}, \lesssim)$ is reducible to a deterministic Büchi automaton
whenever, given any payoff vector $u$, we can construct in polynomial time a deterministic
Büchi automaton over $\mathsf{States}$ which accepts exactly all plays $\rho$ with $u \lesssim \mathsf{payoff}_{\omega}(\rho)$. For
such preorders, we will see that the value problem can be solved efficiently by constructing
the product of the deterministic Büchi automaton and the arena of the game. This construction
does however not help for solving the (constrained) NE existence problems since
the number of players is a parameter of the problem, and the size of the resulting game will
then be exponential.

Proposition 6.8. For finite games with ordered Büchi objectives which are reducible to
deterministic Büchi automata, the value problem is P-complete.

Proof. Given the payoff $v^A$ for player $A$, the algorithm proceeds by constructing the automaton
that recognises the plays with payoff at least $v^A$. By performing the product with
the game as described in Section 5.7.2, we obtain a new game, in which there is a winning
strategy if, and only if, there is a strategy in the original game to ensure payoff $v^A$. In this
new game, player $A$ has a single Büchi objective, so that the existence of a winning
strategy can be decided in polynomial time.

Hardness follows from that of games with single Büchi objectives. □

Applications. We now give preorders to which the above result applies, that is, which are
reducible to deterministic Büchi automata objectives.

Lemma 6.9. An ordered objective where the preorder is either the conjunction, the subset
or the lexicographic preorder is reducible to a deterministic Büchi automaton objective.

Proof. We first focus on the conjunction preorder. Let $\omega = ((\Omega_i)_{1 \le i \le n}, \lesssim)$ be an ordered
Büchi objective, where $\lesssim$ is the conjunction. For every $1 \le i \le n$, let $T_i$ be the target set
defining the Büchi condition $\Omega_i$. There are only two possible payoffs: either all objectives
are satisfied, or one objective is not satisfied. For the second payoff case, any play has a
payoff at least as large: hence the trivial automaton (which accepts all plays) witnesses the property.
For the first payoff case, we construct a deterministic Büchi automaton $\mathcal{B}$ as follows. There
is one state for each target set, plus one accepting state: $Q = \{q_0, q_1, \dots, q_n\}$; the initial
state is $q_0$, and the unique repeated state is $q_n$. For all $1 \le i \le n$, the transitions are
$q_{i-1} \xrightarrow{s} q_i$ when $s \in T_i$ and $q_{i-1} \xrightarrow{s} q_{i-1}$ otherwise. There are also transitions $q_n \xrightarrow{s} q_0$ for
every $s \in \mathsf{States}$. Automaton $\mathcal{B}$ describes the plays that go through each set $T_i$ infinitely
often, hence witnesses the property. It can furthermore be computed in polynomial time.
The construction is illustrated in Figure 21.
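The automaton for the conjunction can be simulated directly on ultimately periodic plays. The following sketch encodes state $q_i$ as the integer $i$ (with target sets stored 0-indexed), and decides acceptance of a play of the form prefix followed by a cycle repeated forever; the function names and the example target sets are assumptions for illustration.

```python
from itertools import product

def conjunction_automaton(targets):
    """Transition function of the automaton: q_{i-1} -> q_i on T_i, q_n -> q_0."""
    n = len(targets)
    def step(i, s):
        if i == n:
            return 0                      # q_n goes back to q_0 on any state
        return i + 1 if s in targets[i] else i
    return step, n                        # n is the unique repeated state q_n

def accepts(step, accepting, prefix, cycle, init=0):
    """Acceptance of prefix . cycle^omega by a deterministic Buchi automaton."""
    q = init
    for s in prefix:
        q = step(q, s)
    seen, hits = {}, []                   # start state and Buchi hit of each round
    while q not in seen:
        seen[q] = len(hits)
        hit = False
        for s in cycle:
            q = step(q, s)
            hit = hit or q == accepting
        hits.append(hit)
    return any(hits[seen[q]:])            # rounds from seen[q] on repeat forever

TARGETS = [{"a"}, {"b"}]
STEP, ACC = conjunction_automaton(TARGETS)
```

On this example, the play cycling through "a", "b", "c" is accepted, while the play cycling through "a", "c" is rejected because the second target set is never visited.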

We now turn to the subset preorder. Let u = be an ordered Biichi 

objective, where < is the subset preorder. For every 1 < z < n, let Ti be the target set 
defining the Biichi condition Hj. Fix a payoff u. A play p is such that u < payofF^^(p) if, 
and only if, p visits infinitely often all sets Tj with ztj = 1. This is then equivalent to the 
conjunction of all Hj’s with Ui = 1. We therefore apply the previous construction for the 
conjunction and get the expected result. 

We finish this proof with the lexicographic preorder. Let $\omega = \langle (\Omega_i)_{1 \le i \le n}, \lesssim \rangle$ be an
ordered Büchi objective, where $\lesssim$ is the lexicographic preorder. For every $1 \le i \le n$, let
$T_i$ be the target set defining the Büchi condition $\Omega_i$. Let $u \in \{0,1\}^n$ be a payoff vector.
We construct the following deterministic Büchi automaton which recognises the runs whose
payoff is greater than or equal to $u$.

In this automaton there is a state $q_i$ for each $i$ such that $u_i = 1$, and a state $q_0$ that is
both initial and repeated: $Q = \{q_0\} \cup \{q_i \mid u_i = 1\}$. We write $I = \{0\} \cup \{i \mid u_i = 1\}$. For
every $i \in I$, we write $\mathsf{succ}(i) = \min(I \setminus \{j \mid j \le i\})$, with the convention that $\min \emptyset = 0$.
The transition relation is defined as follows:

• for every $s \in \mathsf{States}$, there is a transition $q_0 \xrightarrow{s} q_{\mathsf{succ}(0)}$;

• for every $i \in I \setminus \{0\}$, we have the following transitions:

  - $q_i \xrightarrow{T_i} q_{\mathsf{succ}(i)}$;

  - $q_i \xrightarrow{T_k} q_0$ with $k < i$ and $u_k = 0$;

  - $q_i \xrightarrow{s} q_i$ for every $s \in \mathsf{States} \setminus (T_i \cup \bigcup_{k<i,\, u_k=0} T_k)$.

An example of the construction is given in Figure 22.
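
The transition relation above can be tried out on small instances with the following Python sketch. It is an illustration only: the function names are hypothetical, plays are given as a prefix and a non-empty cycle, `targets[i-1]` stands for $T_i$, and one tie-breaking rule is an assumption that the text does not fix (when a state belongs both to $T_i$ and to some $T_k$ with $k < i$ and $u_k = 0$, the sketch takes the transition back to $q_0$).

```python
def make_lex_automaton(u, targets):
    """Transition function for threshold u (tuple of 0/1); targets[i-1] is T_i."""
    n = len(u)
    I = [0] + [i for i in range(1, n + 1) if u[i - 1] == 1]

    def succ(i):
        later = [j for j in I if j > i]
        return min(later) if later else 0      # convention: min of the empty set is 0

    def step(q, s):
        if q == 0:
            return succ(0)                     # q_0 --s--> q_succ(0) for every s
        # Assumption: if s lies in both kinds of target, the move to q_0 wins.
        if any(s in targets[k - 1] for k in range(1, q) if u[k - 1] == 0):
            return 0
        if s in targets[q - 1]:
            return succ(q)
        return q

    return step


def accepts_lasso(step, accepting, prefix, cycle, q0=0):
    """Buchi acceptance of prefix . cycle^omega for a deterministic automaton."""
    q = q0
    for s in prefix:
        q = step(q, s)
    seen, visits = {}, []
    while q not in seen:
        seen[q] = len(visits)
        hit = False
        for s in cycle:
            q = step(q, s)
            hit = hit or q in accepting
        visits.append(hit)
    return any(visits[seen[q]:])


step = make_lex_automaton((0, 1, 1), [{"a"}, {"b"}, {"c"}])
meets_threshold = accepts_lasso(step, {0}, [], ["b", "c"])   # payoff (0, 1, 1)
beats_threshold = accepts_lasso(step, {0}, [], ["a"])        # payoff (1, 0, 0)
below_threshold = accepts_lasso(step, {0}, [], ["b"])        # payoff (0, 1, 0)
```

The three sample plays have payoffs equal to, above and below $u = (0,1,1)$ when the first component is taken as the most significant one, which is the reading suggested by the transitions back to $q_0$.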

We now prove correctness of this construction. Consider a path that goes from $q_0$ to $q_0$:
if the automaton is currently in state $q_i$, then since the last occurrence of $q_0$, at least one
state for each target set $T_j$ with $j < i$ and $u_j = 1$ has been visited. When $q_0$ is reached
again, either it is because we have seen all the $T_j$ with $u_j = 1$, or it is because the run
visited some target $T_i$ with $u_i = 0$ and all the $T_j$ such that $u_j = 1$ and $j < i$; in both cases,
the set of targets that have been visited between two visits to $q_0$ describes a payoff greater
than or equal to $u$. Assume the play $\pi$ is accepted by the automaton; then there is a sequence of $q_i$ as
above that is taken infinitely often, therefore $\mathsf{payoff}_\omega(\pi)$ is greater than or equal to $u$ for the
lexicographic order.

Conversely assume $v = \mathsf{payoff}_\omega(\pi)$ is greater than or equal to $u$, that we already read a
prefix $\pi_{\le k}$ for some $k$, and that the current state is $q_0$. Reading the first symbol in $\pi$ after
position $k$, the run goes to the state $q_i$ where $i$ is the least integer such that $u_i = 1$. Either
the path visits $T_i$ at some point, or it visits a state in a target $T_j$, with $j$ smaller than $i$ and
$u_j = 0$, in which case the automaton goes back to $q_0$. Therefore from $q_0$ we can again come
back to $q_0$ while reading the rest of $\pi$, and the automaton accepts. □




Figure 21. The automaton for the conjunction preorder, n = 3.

Figure 22. The automaton for the lexicographic order, n = 7 and u = (0, 1, 0, 0, 1, 1, 0).


We conclude with the following corollary: 

Corollary 6.10. For finite games with ordered Büchi objectives with either of the conjunction,
the lexicographic or the subset preorders, the value problem is P-complete.


6.4. Preference relations with monotonic preorders. We will see in this part that 
monotonic preorders lead to more efficient algorithms. More precisely we prove the following 
result: 

Proposition 6.11.

• For finite games with ordered Büchi objectives where the preorders
are given by monotonic Boolean circuits, the value problem is in coNP, and the NE existence
problem and the constrained NE existence problem are in NP.

• Completeness holds in both cases for finite games with ordered Büchi objectives where the
preorders are given by monotonic Boolean circuits or with the counting preorder.

• NP-completeness also holds for the constrained NE existence problem for finite games with
ordered Büchi objectives where the preorders admit an element $v$ such that for every $v'$,
it holds $v' \neq \mathbf{1} \Leftrightarrow v' \lesssim v$.†

We first show that monotonicity of the preorders implies some memorylessness property in
the suspect game. We then give algorithms witnessing the claimed upper bounds, and show
the various lower bounds.


† To be fully formal, a preorder $\lesssim$ is in fact a family $(\lesssim_n)_{n \in \mathbb{N}}$ (where $\lesssim_n$ compares two vectors of size $n$),
and this condition should be stated as "for all $n$, there is an element $v_n \in \{0,1\}^n$ such that for all $v' \in \{0,1\}^n$,
it holds $v' \neq \mathbf{1} \Leftrightarrow v' \lesssim_n v_n$".

6.4.1. When monotonicity implies memorylessness. We say that a strategy $\sigma$ is memoryless
(resp. memoryless from state $s_0$) if there exists a function $f\colon \mathsf{States} \to \mathsf{Act}$ such that
$\sigma(h \cdot s) = f(s)$ for every $h \in \mathsf{Hist}$ (resp. for every $h \in \mathsf{Hist}(s_0)$). A strategy profile is said
to be memoryless whenever all strategies of single players are memoryless. We show that when
the preorders used in the ordered Büchi objectives are monotonic, the three problems are
also easier than in the general case. This is because we can find memoryless trigger profiles
(recall Definition 4.1).

We first show this lemma, that will then be applied to the suspect game. 

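Lemma 6.12. Consider a turn-based two-player game. Call Eve one player, let $\sigma_\exists$ be
a strategy for Eve, and let $s_0$ be a state of the game. There is a memoryless strategy $\sigma'_\exists$ such that for
every $\rho' \in \mathsf{Out}(s_0, \sigma'_\exists)$, there exists $\rho \in \mathsf{Out}(s_0, \sigma_\exists)$ such that $\mathsf{Inf}(\rho') \subseteq \mathsf{Inf}(\rho)$.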

Proof. This proof is by induction on the size of the set

$S(\sigma_\exists) = \{(s, m) \mid \exists h \in \mathsf{Hist}(\sigma_\exists).\ \sigma_\exists(h) = m \text{ and } \mathsf{last}(h) = s\}$.

If its size is the same as that of $\{s \mid \exists h \in \mathsf{Hist}(\sigma_\exists).\ \mathsf{last}(h) = s\}$ then the strategy is
memoryless. Otherwise, let $s$ be a state at which $\sigma_\exists$ takes several different actions (i.e.,
$|(\{s\} \times \mathsf{Act}) \cap S(\sigma_\exists)| > 1$).

We will define a new strategy $\sigma'_\exists$ that takes fewer different actions in $s$ and such that
for every outcome of $\sigma'_\exists$, there is an outcome of $\sigma_\exists$ that visits (at least) the same states
infinitely often.

If $\sigma$ is a strategy and $h$ is a history, we let $\sigma \circ h\colon h' \mapsto \sigma(h \cdot h')$ for any history $h'$. Then for
every $m$ such that $(s, m) \in S(\sigma_\exists)$ we let $H_m = \{h \in \mathsf{Hist}(\sigma_\exists) \mid \mathsf{last}(h) = s \text{ and } \sigma_\exists(h) = m\}$,
and for every $h$, $h^{-1} \cdot H_m = \{h' \mid h \cdot h' \in H_m\}$. We pick $m$ such that $H_m$ is not empty.

• Assume that there is $h_0 \in \mathsf{Hist}(\sigma_\exists)$ with $\mathsf{last}(h_0) = s$, such that $h_0^{-1} \cdot H_m$ is empty.
We define a new strategy $\sigma'_\exists$ as follows. If $h$ is a history which does not visit $s$, then
$\sigma'_\exists(h) = \sigma_\exists(h)$. If $h$ is a history which visits $s$, then decompose $h$ as $h' \cdot h''$ where
$\mathsf{last}(h') = s$ is the first visit to $s$ and define $\sigma'_\exists(h) = \sigma_\exists(h_0 \cdot h'')$. Then, strategy $\sigma'_\exists$ does
not use $m$ at state $s$, and therefore at least one action has been "removed" from the
strategy. More precisely, $|(\{s\} \times \mathsf{Act}) \cap S(\sigma'_\exists)| \le |(\{s\} \times \mathsf{Act}) \cap S(\sigma_\exists)| - 1$. Furthermore
the condition on the states which are visited infinitely often by outcomes of $\sigma'_\exists$ is
also satisfied.

• Otherwise for any $h \in \mathsf{Hist}(\sigma_\exists)$ with $\mathsf{last}(h) = s$, $h^{-1} \cdot H_m$ is not empty. We will construct
a strategy $\sigma'_\exists$ which plays $m$ at $s$. Let $h$ be a history; we first define the extension $e(h)$
inductively as follows:

  - $e(\varepsilon) = \varepsilon$, where $\varepsilon$ is the empty history;

  - $e(h \cdot s) = e(h) \cdot h'$ where $h' \in (e(h))^{-1} \cdot H_m$;

  - $e(h \cdot s') = e(h) \cdot s'$ if $s' \neq s$.

We extend the definition of $e$ to infinite outcomes in the natural way: $e(\rho)_i = e(\rho_{\le i})_i$.
We then define the strategy $\sigma'_\exists\colon h \mapsto \sigma_\exists(e(h))$. We show that if $\rho$ is an outcome of $\sigma'_\exists$,
then $e(\rho)$ is an outcome of $\sigma_\exists$. Indeed assume $h$ is a finite outcome of $\sigma'_\exists$, that $e(h)$ is
an outcome of $\sigma_\exists$ and $\mathsf{last}(h) = \mathsf{last}(e(h))$. If $h \cdot s$ is an outcome of $\sigma'_\exists$, by construction
of $e$, $e(h \cdot s) = e(h) \cdot h'$, such that $\mathsf{last}(h') = s$, and $h'$ is an outcome of $\sigma_\exists \circ e(h)$; as
$e(h)$ is an outcome of $\sigma_\exists$ by hypothesis, this means that $e(h \cdot s)$ is an outcome of $\sigma_\exists$. If
$h \cdot s'$ with $s' \neq s$ is an outcome of $\sigma'_\exists$, then $e(h \cdot s') = e(h) \cdot s'$, $s' \in \mathsf{Tab}(\mathsf{last}(h), \sigma'_\exists(h))$, and
$\sigma'_\exists(h) = \sigma_\exists(e(h))$. Using the hypothesis $\mathsf{last}(h) = \mathsf{last}(e(h))$, and that $e(h)$ is an outcome of $\sigma_\exists$,
we get that $e(h \cdot s')$ is an outcome of $\sigma_\exists$. This shows that if $\rho$ is an outcome of $\sigma'_\exists$ then
$e(\rho)$ is an outcome of $\sigma_\exists$. The property on states visited infinitely often follows. Several
moves have been removed from the strategy at $s$ (since the strategy is now memoryless
at $s$, playing $m$).

In all cases we have $S(\sigma'_\exists)$ strictly included in $S(\sigma_\exists)$, and an inductive reasoning entails the
result. □

Lemma 6.13. If for every player $A$, $\lesssim_A$ is monotonic, and if there is a trigger profile for
some play $\pi$ from $s$, then there is a memoryless winning strategy for Eve in $\mathcal{H}(\mathcal{G}, \pi)$ from
state $(s, \mathsf{Agt})$.

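Proof. Assume there is a trigger profile for $\pi$. We have seen in Lemma 4.4 that there is
then a winning strategy $\sigma_\exists$ in game $\mathcal{H}(\mathcal{G}, \pi)$ for Eve. Consider the memoryless strategy $\sigma'_\exists$
constructed as in Lemma 6.12. Let $\rho'$ be an outcome of $\sigma'_\exists$; there is an outcome $\rho$ of $\sigma_\exists$
such that $\mathsf{Inf}(\rho') \subseteq \mathsf{Inf}(\rho)$. As $\sigma_\exists$ is winning in $\mathcal{H}(\mathcal{G}, \pi)$, for every $A \in \lambda(\rho)$, $\mathsf{proj}_1(\rho) \lesssim_A \pi$.

We assume the Büchi conditions are given by the target sets $(T_i^A)_{A,i}$. For each player $A$,
$\{i \mid \mathsf{Inf}(\mathsf{proj}_1(\rho')) \cap T_i^A \neq \emptyset\} \subseteq \{i \mid \mathsf{Inf}(\mathsf{proj}_1(\rho)) \cap T_i^A \neq \emptyset\}$. As the preorder is monotonic, the payoff
of $\mathsf{proj}_1(\rho')$ is smaller than that of $\mathsf{proj}_1(\rho)$: $\mathsf{proj}_1(\rho') \lesssim_A \mathsf{proj}_1(\rho)$. So the play is winning
for any player $A$ and $\sigma'_\exists$ is a memoryless winning strategy in game $\mathcal{H}(\mathcal{G}, \pi)$ for Eve. □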

Lemma 6.14. If for every player $A$, $\lesssim_A$ is given by monotonic Boolean circuits, then given
a path $\pi$, we can decide in polynomial time if a memoryless strategy for Eve in $\mathcal{H}(\mathcal{G}, \pi)$ is
winning.

Proof. Let $\sigma_\exists$ be a memoryless strategy in $\mathcal{H}(\mathcal{G}, \pi)$ for Eve. By keeping only the edges that
are taken by $\sigma_\exists$, we define a subgraph of the game. We can compute in polynomial time
the strongly connected components of this graph. If one component is reachable and does
not satisfy the objective of Eve, then the strategy is not winning. Conversely if all the
reachable strongly connected components satisfy the winning condition of Eve, since the
preorder is monotonic, $\sigma_\exists$ is a winning strategy. Notice that since the preorder is given as a
Boolean circuit, we can check in polynomial time whether a strongly connected component
is winning or not. Globally the algorithm is therefore polynomial-time. □
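
The check described in this proof can be sketched as follows in Python. This is a simplified illustration, not the paper's algorithm verbatim: `graph` is assumed to be the subgraph that remains once Eve's memoryless strategy is fixed, and `component_is_winning` is a hypothetical callback standing for the polynomial-time circuit evaluation on a strongly connected component.

```python
def reachable_sccs(graph, start):
    """Tarjan's algorithm restricted to the part of `graph` reachable from `start`."""
    index, low, on_stack, stack, sccs = {}, {}, set(), [], []

    def visit(v):
        index[v] = low[v] = len(index)
        stack.append(v)
        on_stack.add(v)
        for w in graph.get(v, ()):
            if w not in index:
                visit(w)
                low[v] = min(low[v], low[w])
            elif w in on_stack:
                low[v] = min(low[v], index[w])
        if low[v] == index[v]:                 # v is the root of a component
            component = set()
            while True:
                w = stack.pop()
                on_stack.discard(w)
                component.add(w)
                if w == v:
                    break
            sccs.append(component)

    visit(start)
    return sccs


def memoryless_strategy_is_winning(graph, start, component_is_winning):
    """Every reachable component that contains a cycle must pass the check."""
    for component in reachable_sccs(graph, start):
        has_cycle = len(component) > 1 or any(v in graph.get(v, ()) for v in component)
        if has_cycle and not component_is_winning(component):
            return False
    return True


graph = {"a": ["b"], "b": ["c", "d"], "c": ["b"], "d": ["d"]}
ok = memoryless_strategy_is_winning(graph, "a", lambda comp: "x" not in comp)
ko = memoryless_strategy_is_winning(graph, "a", lambda comp: "d" not in comp)
```

Components without a cycle are skipped because no infinite play can stay in them; the recursive traversal is kept for brevity and would need an explicit stack on large graphs.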

We now turn to the proof of the claimed upper bounds. 

6.4.2. Proofs for the upper bounds. We show that the value problem is in coNP for finite
games with ordered Büchi objectives, when preorders are given by monotonic Boolean
circuits.

As already mentioned at the beginning of Section 5, for the value problem, we can
make the concurrent game turn-based: since player $A$ must win against any strategy of the
coalition $P = \mathsf{Agt} \setminus \{A\}$, she must also win in the case where the opponents' strategies can
adapt to what $A$ plays. In other terms, we can make $A$ play first, and then the coalition.
This turn-based game is determined, so that there is a strategy $\sigma$ whose outcomes are always
better (for $A$) than the threshold $v$ if, and only if, for any strategy $\sigma'$ of coalition $P$, there is an outcome
with payoff (for $A$) better than $v$. If there is a counterexample to this fact, then thanks to
Lemma 6.12 there is one with a memoryless strategy $\sigma'$. The coNP algorithm proceeds by
checking that all the memoryless strategies of coalition $P$ have an outcome better than $v$,
which is achievable in polynomial time, with a method similar to Lemma 6.14.

We now show that the constrained NE existence problem is in NP for finite games with
ordered Büchi objectives, when preorders are given by monotonic Boolean circuits.

The algorithm for the constrained NE existence problem proceeds by guessing:

• the payoff for each player,

• a play of the form $\pi \cdot \tau^\omega$, where $|\pi| \le |\mathsf{States}|^2$ and $|\tau| \le |\mathsf{States}|^2$,

• an under-approximation $W$ of the set of winning states in $\mathcal{H}(\mathcal{G}, \pi \cdot \tau^\omega)$,

• a memoryless strategy profile $\sigma_{\mathsf{Agt}}$ in $\mathcal{H}(\mathcal{G}, \pi \cdot \tau^\omega)$.

We check that $\sigma_{\mathsf{Agt}}$ is a witness for the fact that the states in $W$ are winning; thanks to
Lemma 6.14, this can be done in polynomial time. We also verify that the play has the
expected payoff, that the payoff satisfies the constraints, and that it never gets out of $W$. If
these conditions are fulfilled, then the play $\pi \cdot \tau^\omega$ meets the conditions of Theorem 4.5, and
there is a Nash equilibrium with outcome $\pi \cdot \tau^\omega$. Lemma 6.13 and Proposition 3.1 ensure
that if there is a Nash equilibrium, we can find it this way.

6.4.3. Proofs for the hardness results. We first prove the hardness results for the counting
preorder.

Lemma 6.15. For finite games with ordered Büchi objectives that use the counting preorder,
the value problem is coNP-hard.

Proof. We reduce (the complement of) 3SAT into the value problem for two-player turn-based
games with Büchi objectives with the counting preorder. Consider an instance

$\phi = C_1 \wedge \cdots \wedge C_m$

with $C_j = \ell_{j,1} \vee \ell_{j,2} \vee \ell_{j,3}$ over a set of variables $\{x_1, \ldots, x_n\}$. With $\phi$, we associate a
two-player turn-based game $\mathcal{G}$. Its set of states is made of

• a set containing the unique initial state $V_0 = \{s_0\}$,

• a set of two states $V_k = \{x_k, \neg x_k\}$ for each $1 \le k \le n$,

• and a set of three states $V_{n+j} = \{t_{j,1}, t_{j,2}, t_{j,3}\}$ for each $1 \le j \le m$.

Then, for each $0 \le l \le n + m$, there is a transition between any state of $V_l$ and any state
of $V_{l+1}$ (assuming $V_{n+m+1} = V_0$).

The game involves two players: player $B$ owns all the states, but has no objectives
(she always loses). Player $A$ has a set of Büchi objectives defined by $T_{2k-1} = \{x_k\} \cup \{t_{j,p} \mid
\ell_{j,p} = x_k\}$, $T_{2k} = \{\neg x_k\} \cup \{t_{j,p} \mid \ell_{j,p} = \neg x_k\}$, for $1 \le k \le n$. Notice that at least $n$ of
these objectives will be visited infinitely often along any infinite play. We prove that if the
formula is not satisfiable, then at least $n + 1$ objectives will be fulfilled, and conversely.

Assume the formula is satisfiable, and pick a witnessing valuation $v$. We define a
strategy $\sigma_B$ for $B$ that "follows" valuation $v$: from states in $V_{k-1}$, for any $1 \le k \le n$, the
strategy plays towards $x_k$ if $v(x_k) = \mathsf{true}$ (and to $\neg x_k$ otherwise). Then, from a state
in $V_{n+l-1}$ with $1 \le l \le m$, it plays towards one of the $t_{l,p}$ that evaluates to true under $v$
(the one with least index $p$, say). This way, the number of targets of player $A$ that are
visited infinitely often is $n$.

Conversely, pick a play in $\mathcal{G}$ s.t. at most (hence exactly) $n$ objectives of $A$ are fulfilled.
In particular, for any $1 \le k \le n$, this play visits only one of $x_k$ and $\neg x_k$ infinitely often, so that it defines
a valuation $v$ over $\{x_1, \ldots, x_n\}$. Moreover, any state of $V_{n+l}$, with $1 \le l \le m$, that is visited
infinitely often must correspond to a literal that is made true by $v$, as otherwise this would
make one more objective that is fulfilled for $A$. As a consequence, each clause of $\phi$ evaluates
to true under $v$, and the result follows. □

Example 6.16. We illustrate the construction of the previous proof in Figure 23 for the
formula

$\phi = (x_1 \vee x_2 \vee \neg x_3) \wedge (\neg x_1 \vee x_2 \vee \neg x_3)$.    (6.1)

The targets for player $A$ are $T_1 = \{x_1, t_{1,1}\}$, $T_2 = \{\neg x_1, t_{2,1}\}$, $T_3 = \{x_2, t_{1,2}, t_{2,2}\}$, $T_4 =
\{\neg x_2\}$, $T_5 = \{x_3\}$, $T_6 = \{\neg x_3, t_{1,3}, t_{2,3}\}$. Player $A$ cannot ensure visiting infinitely often
four target sets, therefore the formula is satisfiable.
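
The reduction of Lemma 6.15 can be checked on small formulas with the Python sketch below. It is an illustration under assumptions that are not in the paper: literals are encoded as non-zero integers (`k` for $x_k$, `-k` for $\neg x_k$), the function names are hypothetical, and player $B$ is restricted to repeating one fixed choice per layer forever. That restriction loses nothing for minimising, since any other play visits infinitely often a superset of the states of some such cyclic play.

```python
from itertools import product


def build_targets(clauses, n):
    """clauses: list of 3-tuples of non-zero ints (k for x_k, -k for its negation)."""
    targets = []
    for k in range(1, n + 1):
        for lit in (k, -k):
            target = {("lit", lit)}
            for j, clause in enumerate(clauses, start=1):
                for p, l in enumerate(clause, start=1):
                    if l == lit:
                        target.add(("t", j, p))      # state t_{j,p} of clause j
            targets.append(target)                   # order: T_1, T_2, ..., T_{2n}
    return targets


def min_targets_visited(clauses, n):
    """Fewest targets of A that player B can make visited infinitely often."""
    targets = build_targets(clauses, n)
    best = None
    for lits in product(*[(k, -k) for k in range(1, n + 1)]):
        for picks in product(range(1, 4), repeat=len(clauses)):
            visited = {("lit", l) for l in lits}
            visited |= {("t", j, p) for j, p in enumerate(picks, start=1)}
            count = sum(1 for t in targets if t & visited)
            best = count if best is None else min(best, count)
    return best


example_6_16 = [(1, 2, -3), (-1, 2, -3)]
satisfiable_case = min_targets_visited(example_6_16, 3)
unsatisfiable_case = min_targets_visited([(1, 1, 1), (-1, -1, -1)], 1)
```

For the formula of Example 6.16 the minimum equals $n = 3$, matching the claim that $A$ cannot ensure four targets; for the unsatisfiable formula $x_1 \wedge \neg x_1$ (padded to three literals per clause) the minimum is $n + 1 = 2$.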

Lemma 6.17. For finite games with ordered Büchi objectives that use the counting preorder,
the NE existence problem is NP-hard.

Proof. Let $\mathcal{G}$ be the game we constructed for Lemma 6.15. We construct the game $\mathcal{G}'$ from
$\mathcal{G}$ as described in Section 3.3. The preference in $\mathcal{G}'$ can still be described with ordered
Büchi objectives and the counting preorder: the only target set of $B$ is $\{s_1\}$ and we add
$s_1$ to $n$ different targets of $A$, where $n$ is the number of variables as in Lemma 6.15. From
Proposition 3.4, there is a Nash equilibrium in $\mathcal{G}'$ from $s_0$ if, and only if, $A$ cannot ensure
visiting at least $n + 1$ targets infinitely often. Hence the NE existence problem is NP-hard. □

This also proves NP-hardness for the constrained NE existence problem for ordered
Büchi objectives with the counting preorder. Hardness results for preorders given by monotonic
Boolean circuits follow from the above since the counting preorder is a special case of
preorder given as a monotonic Boolean circuit (and the counting preorder can be expressed
as a polynomial-size monotonic Boolean circuit).

We now show hardness in the special case of preorders with (roughly) at most one
maximal element below $\mathbf{1}$.

Lemma 6.18. For finite turn-based games with ordered Büchi objectives with a monotonic
preorder for which there is an element $v$ such that for every $v'$, $v' \neq \mathbf{1} \Leftrightarrow v' \lesssim v$, the
constrained NE existence problem is NP-hard.

Proof. Let us consider a formula $\phi = C_1 \wedge \cdots \wedge C_m$. For each variable $x_i$, our game has one
player $B_i$ and three states $s_i$, $x_i$ and $\neg x_i$. The objectives of $B_i$ are the sets $\{x_i\}$ and $\{\neg x_i\}$.
Transitions go from each $s_i$ to $x_i$ and $\neg x_i$, and from $x_i$ and $\neg x_i$ to $s_{i+1}$ (wrapping around
to the initial state $s_0$ after the last variable).
Finally, an extra player $A$ has full control of the game (i.e., she owns all the states) and has
one objective for each clause $C_i$ with $1 \le i \le m$, whose target set consists of the states
corresponding to the literals of $C_i$. The construction is illustrated in Figure 24.

We show that formula $\phi$ is satisfiable if, and only if, there is a Nash equilibrium where
each player $B_i$ gets payoff $\beta_i$ satisfying $\beta_i \lesssim v$ (hence $\beta_i \neq (1,1)$), and player $A$ gets payoff $\mathbf{1}$.

Figure 24. The Büchi game for a formula with 4 variables.

First assume that the formula is satisfiable, and pick a witnessing valuation $u$. By playing
according to $u$, player $A$ can satisfy all of her objectives (hence she cannot improve her
payoff, since the preorder is monotonic). Since she alone controls all the game, the other
players cannot improve their payoff, so that this is a Nash equilibrium. Moreover, since $A$
plays memoryless, only one of $x_i$ and $\neg x_i$ is visited for each $i$, so that the payoff $\beta_i$ for $B_i$
satisfies $\beta_i \lesssim v$. Conversely, if there is a Nash equilibrium with the desired payoff, then by
hypothesis, exactly one of each $x_i$ and $\neg x_i$ is visited infinitely often (so that the payoff for $B_i$
is not $(1,1)$), which defines a valuation $u$. Since in this Nash equilibrium, player $A$ satisfies
all her objectives, one state of each target is visited, which means that under valuation $u$,
formula $\phi$ evaluates to true. □

6.4.4. Applications. We now describe examples of preorders which satisfy the conditions on
the existence of an element $v$ such that $v' \neq \mathbf{1} \Leftrightarrow v' \lesssim v$.

Lemma 6.19. Conjunction, counting and lexicographic preorders have an element $v$ such
that $v' \neq \mathbf{1} \Leftrightarrow v' \lesssim v$.

Proof. Consider $v = (1, \ldots, 1, 0)$, and $v' \neq \mathbf{1}$. For conjunction, there is $i$ such that $v'_i = 0$,
so $v' \lesssim v$. For counting, $|\{i \mid v'_i = 1\}| < n$, so $v' \lesssim v$. For the lexicographic preorder, let $i$
be the smallest index such that $v'_i = 0$, and either $v_i = 1$ and $v_j = v'_j$ for all $j < i$, or for all
$j \in \{1, \ldots, n\}$, $v_j = v'_j$. In both cases $v' \lesssim v$. □
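
The statement of Lemma 6.19 can be checked exhaustively for small $n$ with the Python sketch below. It is only a sanity check under assumptions: the three comparison functions are one plausible encoding of the conjunction, counting and lexicographic preorders (their formal definitions are given earlier in the paper and are not repeated in this section), and all function names are hypothetical.

```python
from itertools import product


def conj_leq(a, b):
    # Conjunction: only "all objectives satisfied" is strictly better than the rest.
    return (not all(a)) or all(b)


def count_leq(a, b):
    # Counting: compare the number of satisfied objectives.
    return sum(a) <= sum(b)


def lex_leq(a, b):
    # Lexicographic: Python compares tuples lexicographically, first component first.
    return a <= b


def subset_leq(a, b):
    # Subset: componentwise comparison (a partial order, used as a contrast).
    return all(x <= y for x, y in zip(a, b))


def witness_ok(leq, n):
    """Does v = (1, ..., 1, 0) satisfy: v' != 1  iff  v' <= v, for all v'?"""
    v = (1,) * (n - 1) + (0,)
    one = (1,) * n
    return all((w != one) == leq(w, v) for w in product((0, 1), repeat=n))


lemma_holds = all(witness_ok(leq, n)
                  for leq in (conj_leq, count_leq, lex_leq)
                  for n in range(1, 5))
subset_fails = not witness_ok(subset_leq, 2)
```

The last line illustrates that the same vector is not a witness for the subset preorder, since $(0,1)$ is different from $\mathbf{1}$ but incomparable with $(1,0)$.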

As a consequence, the result of Lemma 6.18 applies in particular to the conjunction
and lexicographic preorders, for which the constrained NE existence problem is thus NP-complete.
Hence we get:

Corollary 6.20. For finite games with ordered Büchi objectives with either of the conjunction
or the lexicographic preorders, the constrained NE existence problem is NP-complete.


7. Ordered reachability objectives 

In this Section we assume that preference relations of the players are given by ordered 
reachability objectives (as defined in Section [23]) , and we prove the results listed in Table [3| 
(page [3]). We will first consider the general case when preorders are given by Boolean circuits 
and we will show that the various decision problems are PSPACE-complete. We will even 
notice that the hardness result holds for several simpler preorders. We will finally improve 
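this result in a number of cases.

For the rest of this section, we fix a game $\mathcal{G} = \langle \mathsf{States}, \mathsf{Agt}, \mathsf{Act}, \mathsf{Mov}, \mathsf{Tab}, (\lesssim_A)_{A \in \mathsf{Agt}} \rangle$,
and we assume that $\lesssim_A$ is given by an ordered reachability objective $\omega_A = \langle (\Omega_i^A)_{1 \le i \le n_A},
(\lesssim_A)_{A \in \mathsf{Agt}} \rangle$.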

7.1. General case: preorders are given as circuits. We prove the following result:

Proposition 7.1.

• For finite games with ordered reachability objectives where preorders are given by Boolean
circuits, the value problem, the NE existence problem and the constrained NE existence
problem are in PSPACE.

• For finite two-player turn-based games with ordered reachability objectives where preorders
have $\mathbf{1}$ as a unique maximal element, the value problem is PSPACE-hard.

• For finite two-player games with ordered reachability objectives where preorders have $\mathbf{1}$ as
a unique maximal element, and have an element $v$ such that for every $v'$, $v' \neq \mathbf{1} \Leftrightarrow v' \lesssim v$,
the NE existence problem and the constrained NE existence problem are PSPACE-hard.

The upper bound will be proven by reduction to games with ordered Büchi objectives using
game-simulation.

7.1.1. Reduction to a game with ordered Büchi objectives. We show how to transform a game
$\mathcal{G}$ with preferences given by Boolean circuits over reachability objectives into a new game $\mathcal{G}'$,
with preferences given by Boolean circuits over Büchi objectives. Although the size of $\mathcal{G}'$
will be exponential, circuit orders with Büchi objectives define prefix-independent preference
relations and thus checking condition 3 of Theorem 4.5 can be made more efficient.

States of $\mathcal{G}'$ store the set of states of $\mathcal{G}$ that have already been visited. The set of
states of $\mathcal{G}'$ is $\mathsf{States}' = \mathsf{States} \times 2^{\mathsf{States}}$. The transitions are as follows: $(s, S) \to (s', S')$
when there is a transition $s \to s'$ in $\mathcal{G}$ and $S' = S \cup \{s'\}$. We keep the same circuits to
define the preference relations, but the reachability objectives are transformed into Büchi
objectives: a target set $T$ is transformed into $T' = \{(s, S) \mid S \cap T \neq \emptyset\}$. Although the
game has exponential size, the preference relations only depend on the strongly connected
components the path ends in, so that we will be able to use a special algorithm, which we
describe after this lemma.
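
The state space of $\mathcal{G}'$ can be illustrated on a plain graph with the Python sketch below. It is a simplification made for illustration: players, moves and the table $\mathsf{Tab}$ are ignored, `transitions` is an assumed adjacency dictionary, and only the part of $\mathcal{G}'$ reachable from $(s_0, \{s_0\})$ is built.

```python
def visited_set_game(transitions, s0):
    """Reachable part of G': states are pairs (s, frozenset of visited states)."""
    init = (s0, frozenset([s0]))
    edges, todo = {}, [init]
    while todo:
        state = todo.pop()
        if state in edges:
            continue
        s, seen = state
        # (s, S) -> (s', S union {s'}) for every transition s -> s'
        edges[state] = [(t, seen | {t}) for t in transitions[s]]
        todo.extend(edges[state])
    return init, edges


def buchi_target(states, reach_target):
    """T' = {(s, S) | S intersects T}."""
    return {(s, seen) for (s, seen) in states if seen & set(reach_target)}


transitions = {"a": ["b", "c"], "b": ["a"], "c": ["c"]}
init, edges = visited_set_game(transitions, "a")
target = buchi_target(edges.keys(), {"b"})
```

On this three-state graph the reachable part of the product has five states, three of which record that `b` has been visited; once a state enters $T'$ every successor stays in $T'$, which is why reaching $T$ in $\mathcal{G}$ corresponds to visiting $T'$ infinitely often in $\mathcal{G}'$.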

We define the relation $s \triangleleft s'$ over states of $\mathcal{G}$ and $\mathcal{G}'$ if, and only if, $s' = (s, S)$ with
$S \subseteq \mathsf{States}$, and prove that it is a game simulation (see Definition 5.22).

Lemma 7.2. The relation $\triangleleft$ (resp. $\triangleleft^{-1}$) is a game simulation between $\mathcal{G}$ and $\mathcal{G}'$, and it is
preference-preserving from $(s_0, (s_0, \{s_0\}))$ (resp. $((s_0, \{s_0\}), s_0)$).

Proof. Let $m_{\mathsf{Agt}}$ be a move; writing $t = \mathsf{Tab}(s, m_{\mathsf{Agt}})$, we have $\mathsf{Tab}'((s, S), m_{\mathsf{Agt}}) = (t, S \cup
\{t\})$. Therefore $\mathsf{Tab}(s, m_{\mathsf{Agt}}) \triangleleft \mathsf{Tab}'(s', m_{\mathsf{Agt}})$. Let $(t, S')$ be a state of $\mathcal{G}'$; then we also have
$t \triangleleft (t, S')$. If $S' = S \cup \{t\}$ then $\mathsf{Susp}((s, t), m_{\mathsf{Agt}}) = \mathsf{Susp}(((s, S), (t, S')), m_{\mathsf{Agt}})$; otherwise
$\mathsf{Susp}(((s, S), (t, S')), m_{\mathsf{Agt}}) = \emptyset$. In both cases, condition (2) in the definition of a game
simulation is obviously satisfied.

In the other direction, let $(s', S \cup \{s'\}) = \mathsf{Tab}'((s, S), m_{\mathsf{Agt}})$; we have that $s' \triangleleft (s', S \cup
\{s'\})$. Let $t \in \mathsf{States}$. Then $t \triangleleft (t, S \cup \{t\})$, and $\mathsf{Susp}((s, t), m_{\mathsf{Agt}}) = \mathsf{Susp}(((s, S), (t, S \cup
\{t\})), m_{\mathsf{Agt}})$. Hence $\triangleleft^{-1}$ is a game simulation.

Let $\rho$ and $\rho'$ be two paths, from $s_0$ and $(s_0, \{s_0\})$ respectively, and such that $\rho \triangleleft \rho'$.
We show preference preservation, by showing that $\rho$ reaches target set $T$ if, and only if,
$\rho'$ visits $T'$ infinitely often. If $\rho$ visits some state $s \in T$, then from that point, states visited
by $\rho'$ are of the form $(s', S')$ with $s \in S'$; all these states are in $T'$, therefore $\rho'$ visits $T'$
infinitely often. Conversely, if $\rho'$ visits $T'$ infinitely often, then some state of $T$ has been
visited by $\rho$. From this, we easily obtain preference preservation. □

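As a corollary (see the proposition on game simulations in Section 5), we get that there is a
correspondence between Nash equilibria in $\mathcal{G}$ and Nash equilibria in $\mathcal{G}'$.

Lemma 7.3. If there is a Nash equilibrium $\sigma_{\mathsf{Agt}}$ in $\mathcal{G}$ from $s_0$, then there is a Nash
equilibrium $\sigma'_{\mathsf{Agt}}$ in $\mathcal{G}'$ from $(s_0, \{s_0\})$ such that $\mathsf{Out}_{\mathcal{G}}(s_0, \sigma_{\mathsf{Agt}}) \triangleleft \mathsf{Out}_{\mathcal{G}'}((s_0, \{s_0\}), \sigma'_{\mathsf{Agt}})$. And
vice-versa: if there is a Nash equilibrium $\sigma'_{\mathsf{Agt}}$ in $\mathcal{G}'$ from $(s_0, \{s_0\})$, then there is a Nash
equilibrium $\sigma_{\mathsf{Agt}}$ in $\mathcal{G}$ from $s_0$ such that $\mathsf{Out}_{\mathcal{G}'}((s_0, \{s_0\}), \sigma'_{\mathsf{Agt}}) \triangleleft^{-1} \mathsf{Out}_{\mathcal{G}}(s_0, \sigma_{\mathsf{Agt}})$.

Note that, if $\mathsf{Out}_{\mathcal{G}}(s_0, \sigma_{\mathsf{Agt}}) \triangleleft \mathsf{Out}_{\mathcal{G}'}((s_0, \{s_0\}), \sigma'_{\mathsf{Agt}})$, then $\mathsf{Out}_{\mathcal{G}}(s_0, \sigma_{\mathsf{Agt}})$ satisfies the
reachability objective with target set $T$ if, and only if, $\mathsf{Out}_{\mathcal{G}'}((s_0, \{s_0\}), \sigma'_{\mathsf{Agt}})$ satisfies the
Büchi objective with target set $T' = \{(s, S) \mid S \cap T \neq \emptyset\}$. From this strong correspondence
between $\mathcal{G}$ and $\mathcal{G}'$, we get that it is sufficient to look for Nash equilibria in game $\mathcal{G}'$.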

7.1.2. How to efficiently solve the suspect game of $\mathcal{G}'$. In game $\mathcal{G}'$, preference relations are
prefix-independent. Applying Remark 4.6, the preference relation in the suspect game is
then also prefix-independent, and the payoff of a play only depends on which strongly
connected component the path ends in. We now give an alternating algorithm which runs
in polynomial time and solves the game $\mathcal{H}(\mathcal{G}', \pi')$, where $\pi'$ is an infinite path in $\mathcal{G}'$.

Lemma 7.4. The winner of $\mathcal{H}(\mathcal{G}', \pi')$ can be decided by an alternating algorithm which
runs in time polynomial in the size of $\mathcal{G}$.

Proof. Let $C^A$ be the circuit defining the preference relation of player $A$. Let $\rho = (s_i, S_i)_{i \ge 0}$
be a path in $\mathcal{G}'$; the sequence $(S_i)_{i \ge 0}$ is non-decreasing and converges to a limit $S(\rho)$. We
have $\mathsf{payoff}_A(\rho) = \mathbb{1}_{\{i \mid T_i^A \cap S(\rho) \neq \emptyset\}}$. Therefore the winning condition of Eve in $\mathcal{H}(\mathcal{G}', \pi')$ for a
play $\rho$ only depends on the limits $\lambda(\rho)$ and $S(\mathsf{proj}_1(\rho))$. It can be described as a single Büchi
condition with target set $T = \{((s, S), P) \mid \forall A \in P.\ C^A[v^A(S), w^A] \text{ evaluates to true}\}$
where $v^A(S) = \mathbb{1}_{\{i \mid T_i^A \cap S \neq \emptyset\}}$ and $w^A = \mathsf{payoff}_A(\pi')$. We now describe the algorithm.

Initially the current state is set to $((s_0, \{s_0\}), \mathsf{Agt})$. We also keep a list of the states
which have been visited, and we initialise it with $\mathsf{Occ} \leftarrow \{((s_0, \{s_0\}), \mathsf{Agt})\}$. Then,

• if the current state is $((s, S), P)$, the algorithm existentially guesses a move $m_{\mathsf{Agt}}$ of Eve
and we set $t = ((s, S), P, m_{\mathsf{Agt}})$;

• otherwise if the current state is of the form $((s, S), P, m_{\mathsf{Agt}})$, it universally guesses a state $s'$
which corresponds to a move of Adam and we set $t = ((s', S \cup \{s'\}), P \cap \mathsf{Susp}((s, s'), m_{\mathsf{Agt}}))$.

If $t$ was already seen (that is, if $t \in \mathsf{Occ}$), the algorithm returns true when $t \in T$ and false
when $t \notin T$; otherwise the current state is set to $t$, we add $t$ to the list of visited states:
$\mathsf{Occ} \leftarrow \mathsf{Occ} \cup \{t\}$, and we repeat this step. Because we stop when the same state is seen,
the algorithm stops after at most $\ell + 1$ steps, where $\ell$ is the length of the longest acyclic
path. Since the size of $S$ can only increase and the size of $P$ only decrease, we bound $\ell$ with
$|\mathsf{States}|^2 \cdot |\mathsf{Agt}|$.
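
The stopping rule of this algorithm (explore until a state repeats, then answer according to membership in $T$) can be simulated deterministically as in the Python sketch below. It is an illustration on a generic turn-based arena, not the suspect game itself: `owner`, `succ` and the function name are assumptions, and the deterministic simulation may take exponential time, whereas the alternating algorithm only needs each branch to be short. The verdict agrees with the Büchi game only when all states on a common cycle agree on membership in $T$, which holds in $\mathcal{H}(\mathcal{G}', \pi')$ because $S$ and $P$ are constant along cycles.

```python
def first_cycle_winner(owner, succ, start, target):
    """True iff Eve can force the first repeated state to lie in `target`."""

    def solve(state, occ):
        outcomes = []
        for t in succ[state]:
            if t in occ:
                outcomes.append(t in target)       # a loop is closed: stop here
            else:
                outcomes.append(solve(t, occ | {t}))
        # Eve's choices are existential, Adam's choices are universal.
        return any(outcomes) if owner[state] == "eve" else all(outcomes)

    return solve(start, frozenset([start]))


owner = {"a": "eve", "b": "adam", "c": "adam", "d": "adam"}
succ = {"a": ["b", "c"], "b": ["b"], "c": ["c", "d"], "d": ["d"]}
eve_wins_via_b = first_cycle_winner(owner, succ, "a", {"b"})
adam_escapes_to_d = first_cycle_winner(owner, succ, "a", {"c"})
eve_wins_via_c = first_cycle_winner(owner, succ, "a", {"c", "d"})
```

Here `occ` plays the role of the list $\mathsf{Occ}$ and is local to each branch, as in the description above.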

We now prove the correctness of the algorithm. First, $\mathcal{H}(\mathcal{G}', \pi')$ is a turn-based Büchi
game, which is a special case of parity game. Parity games are known to be determined with
memoryless strategies [aatts], hence $\mathcal{H}(\mathcal{G}', \pi')$ is determined with memoryless strategies.

If the algorithm returns true, then there exists a strategy $\sigma_\exists$ of Eve such that for all the
strategies $\sigma_\forall$ of Adam, any outcome $\rho$ of $\mathsf{Out}(\sigma_\exists, \sigma_\forall)$ is such that there exist $i < j \le \ell + 1$
with $\rho_i = \rho_j \in T$ and all $\rho_k$ with $k < j$ are different. We extend this strategy $\sigma_\exists$ to a
winning strategy $\sigma'_\exists$ for Eve. We do so by ignoring the loops we see in the history; formally
we inductively define a reduction $r$ of histories by:

• $r(\varepsilon) = \varepsilon$;

• if $((s, S), P)$ does not appear in $r(h)$ then $r(h \cdot ((s, S), P)) = r(h) \cdot ((s, S), P)$;

• otherwise $r(h \cdot ((s, S), P)) = r(h)_{\le i}$ where $i$ is the smallest index such that $r(h)_i =
((s, S), P)$.

We then define $\sigma'_\exists$ for any history $h$ by $\sigma'_\exists(h) = \sigma_\exists(r(h))$.

We show by induction that if $h$ is a history compatible with $\sigma'_\exists$ from $((s_0, \{s_0\}), \mathsf{Agt})$
then $r(h)$ is compatible with $\sigma_\exists$ from $((s_0, \{s_0\}), \mathsf{Agt})$. It is true when $h = ((s_0, \{s_0\}), \mathsf{Agt})$;
now assuming it holds for all histories of length $\le k$, we show it for histories of length $k + 1$.
Let $h \cdot s$ be a history of length $k + 1$ compatible with $\sigma'_\exists$. By hypothesis $r(h)$ is compatible
with $\sigma_\exists$ and since $\sigma'_\exists(h) = \sigma_\exists(r(h))$, $r(h) \cdot s$ is compatible with $\sigma_\exists$. If $r(h \cdot s) = r(h) \cdot s$ then
$r(h \cdot s)$ is compatible with $\sigma_\exists$. Otherwise $r(h \cdot s)$ is a prefix of $r(h)$ and therefore of length
$\le k$; we can apply the induction hypothesis to conclude that $r(h \cdot s)$ is compatible with $\sigma_\exists$.

We now show that the strategy $\sigma'_\exists$ that we defined is winning. Let $\rho$ be a possible
outcome of $\sigma'_\exists$, and let $i < j$ be the first indices such that $\rho_i, \rho_j \in (\mathsf{States} \times S(\rho)) \times \lambda(\rho)$ and
$\rho_i = \rho_j$. Because there is no repetition between $i$ and $j - 1$: $r(\rho_{\le j-1}) = r(\rho_{\le i-1}) \cdot \rho_i \cdots \rho_{j-1}$.
We have that $\sigma_\exists(r(\rho_{\le i-1}) \cdot \rho_i \cdots \rho_{j-1}) = \sigma'_\exists(\rho_{\le j-1})$. From this move, $\rho_j$ is a possible next
state, so $r(\rho_{\le i-1}) \cdot \rho_i \cdots \rho_j$ is a possible outcome of $\sigma_\exists$. As $\rho_i = \rho_j$ and all other states are
different, by the hypothesis on $\sigma_\exists$ we have that $\rho_j \in T$. This shows that $\rho$ ultimately loops
in states of $T$ and therefore $\rho$ is a winning run for Eve.

Reciprocally, if Eve has a winning strategy, she has a memoryless one $\sigma_\exists$ since this is a
Büchi game. We can see this strategy as an oracle for the various existential choices in the
algorithm. Consider some universal choices in the algorithm; they correspond to a strategy
$\sigma_\forall$ for Adam. The branch corresponding to $(\sigma_\exists, \sigma_\forall)$ ends the first time we encounter a loop;
we write this history $h \cdot h'$ with $\mathsf{last}(h') = \mathsf{last}(h)$. Since the strategy $\sigma_\exists$ is memoryless,
$h \cdot h'^\omega$ is a possible outcome. Since it is winning, $\mathsf{last}(h')$ is in $T$ and therefore the branch
is accepting. This being true for all the branches given by the choices of $\sigma_\exists$, the algorithm
answers true. □

7.1.3. Proof of the PSPACE upper bounds in Proposition 7.1. We describe a PSPACE algorithm
for solving the constrained NE existence problem. The algorithm proceeds by trying
all plays $\pi$ in $\mathcal{G}$ of the form described in Proposition 3.1. This corresponds to a (unique)
play $\pi'$ in $\mathcal{G}'$. We check that $\pi'$ has a payoff satisfying the constraints, and that there is a
path $\rho$ in $\mathcal{H}(\mathcal{G}', \pi')$, whose projection is $\pi'$, along which Adam obeys Eve, and which stays
in the winning region of Eve. This last step is done by using the algorithm of Lemma 7.4
on each state $\rho$ goes through. All these conditions are satisfied exactly when the conditions
of Theorem 4.5 are satisfied, in which case there is a Nash equilibrium within the given
bounds.

The PSPACE upper bound for the value problem can be inferred from Proposition 3.2.

7.1.4. Proof of PSPACE-hardness for the value problem. We show PSPACE-hardness of the
value problem when the preorder has $\mathbf{1}$ as a unique maximal element.

We reduce QSAT to the value problem, where QSAT is the satisfiability problem for
quantified Boolean formulas. For an instance of QSAT, we assume without loss of generality
that the Boolean formula is a conjunction of disjunctive clauses.‡

Let $\phi = Q_1 x_1 \ldots Q_p x_p.\ \phi'$, where $Q_i \in \{\forall, \exists\}$ and $\phi' = c_1 \wedge \cdots \wedge c_n$ with $c_i = \bigvee_{1 \le j \le 3} \ell_{i,j}$
and $\ell_{i,j} \in \{x_k, \neg x_k \mid 1 \le k \le p\} \cup \{\top, \bot\}$. We define a turn-based game $\mathcal{G}(\phi)$ in the
following way (illustrated in Example 7.6 below). There is one state for each quantifier, one
for each literal, and two additional states $\top$ and $\bot$:

$\mathsf{States} = \{Q_k \mid 1 \le k \le p\} \cup \{x_k, \neg x_k \mid 1 \le k \le p\} \cup \{\top, \bot\}$.

The game involves two players, $A$ and $B$. The states $\top$ and $\bot$, the existential-quantifier
states and the literal states are all controlled by $A$, while the universal-quantifier states
belong to player $B$. For all $1 \le k \le p$, the state corresponding to quantifier $Q_k$ has two
outgoing transitions, going to $x_k$ and $\neg x_k$ respectively. Those two literal states only have
one transition to the next quantifier state $Q_{k+1}$, or to the final state $\top$ if $k = p$. Finally,
states $\top$ and $\bot$ carry a self-loop (notice that $\bot$ is not reachable, while $\top$ will always be
visited).

Player $A$ has one target set for each clause: if $c_i = \bigvee_{1 \le j \le 3} \ell_{i,j}$ then $T_i^A = \{\ell_{i,j} \mid 1 \le
j \le 3\}$. The $i$-th objective $\Omega_i^A$ is to reach target set $T_i^A$. The following result is then
straightforward:

Lemma 7.5. Formula $\phi$ is valid if, and only if, player $A$ has a strategy whose outcomes
from state $Q_1$ all visit each target set $T_i^A$.
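
The statement of Lemma 7.5 can be explored on small instances with the Python sketch below. It is an illustration under assumptions: quantifiers are encoded as a string over `'E'` and `'A'`, literals as integers (`k` for $x_k$, `-k` for $\neg x_k$) or the strings `'T'` and `'F'` for $\top$ and $\bot$, and the function name is hypothetical. The recursion follows the game $\mathcal{G}(\phi)$: player $A$ chooses at existential states, player $B$ at universal ones, and every play ends in the state $\top$.

```python
def a_can_visit_all_targets(quantifiers, clauses):
    """quantifiers: string over 'E'/'A'; clauses: tuples of ints, 'T' or 'F'."""
    p = len(quantifiers)

    def play(k, visited):
        if k > p:                                   # the play ends in the sink state
            visited = visited | {"T"}
            # Target T_i is reached iff some literal state of clause c_i was visited.
            return all(any(l in visited for l in c) for c in clauses)
        branches = [play(k + 1, visited | {lit}) for lit in (k, -k)]
        # A picks the literal at existential states, B at universal ones.
        return any(branches) if quantifiers[k - 1] == "E" else all(branches)

    return play(1, frozenset())


xor_clauses = [(1, 2, "F"), (-1, -2, "F")]          # (x1 or x2) and (not x1 or not x2)
forall_exists = a_can_visit_all_targets("AE", xor_clauses)
exists_forall = a_can_visit_all_targets("EA", xor_clauses)
```

The first instance encodes the valid formula $\forall x_1 \exists x_2.\ (x_1 \vee x_2) \wedge (\neg x_1 \vee \neg x_2)$ and the second swaps the quantifiers, which makes the formula invalid; the sketch returns `True` and `False` respectively, in line with the lemma.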

Proof. We begin with the direct implication, by induction on $p$. For the base case, $\phi =
Q_1 x_1.\ \bigwedge_i c_i$ where $c_i$ only involves $x_1$ and $\neg x_1$. We consider two cases:

• $Q_1 = \exists$: since we assume $\phi$ to be true, there must exist a value for $x_1$ which makes all clauses
true. If this value is $\top$, consider the strategy $\sigma_\top$ of Player $A$ such that $\sigma_\top(Q_1) = x_1$. Then
each clause $c_i$ must have $x_1$ as one of its literals, so that the objective $\Omega_i^A$ is satisfied with
this strategy. The same argument applies if the value for $x_1$ were $\bot$.

• $Q_1 = \forall$: in that case, Player $A$ has only one strategy. For both $x_1$ and $\neg x_1$ all the clauses
are satisfied. It follows that each clause $c_i$ must contain $x_1$ and $\neg x_1$, so that objective $\Omega_i^A$
is satisfied for any strategy of player $B$.

Now, assume that the result holds for all QSAT instances with at most $p - 1$ quantifiers.

• If $Q_1 = \exists$, then one of $Q_2 x_2 \ldots Q_p x_p.\ \phi'[x_1 \leftarrow \top]$ and $Q_2 x_2 \ldots Q_p x_p.\ \phi'[x_1 \leftarrow \bot]$ is valid. We handle the first case, the second one being symmetric. For a literal $\lambda_k \in \{x_k, \neg x_k\}$, we write $T_{\lambda_k}$ for the set of target sets $T_i^A$ such that the clause $c_i$ contains the literal $\lambda_k$.

Assume $Q_2 x_2 \ldots Q_p x_p.\ \phi'[x_1 \leftarrow \top]$ is valid; by induction we know that there exists a strategy $\sigma^{x_1}$ such that all the targets not in $T_{x_1}$ are visited along any outcome from state $Q_2$ (because $\mathcal{G}(Q_2 x_2 \ldots Q_p x_p.\ \phi'[x_1 \leftarrow \top])$ is the same game as $\mathcal{G}(\phi)$, but with $Q_2$ as the initial state, and with the targets in $T_{x_1}$ containing $\top$ in place of $x_1$). We define the strategy $\sigma$ by $\sigma(Q_1) = x_1$ and $\sigma(Q_1 \cdot x_1 \cdot \rho) = \sigma^{x_1}(\rho)$. Any outcome of $\sigma$ will necessarily visit $x_1$, hence visiting all the targets in $T_{x_1}$; because $\sigma$ follows $\sigma^{x_1}$, all the objectives not in $T_{x_1}$ are met as well.

⁸With the convention that an empty disjunction is equivalent to $\bot$.

• If $Q_1 = \forall$, then $Q_2 x_2 \ldots Q_p x_p.\ \phi'[x_1 \leftarrow \top]$ is valid. Using the induction hypothesis we know that from $Q_2$ there is a strategy $\sigma^{x_1}$ that enforces a visit to all the targets not in $T_{x_1}$. Similarly, $Q_2 x_2 \ldots Q_p x_p.\ \phi'[x_1 \leftarrow \bot]$ is valid, and there is a strategy $\sigma^{\neg x_1}$ that visits all the objectives not in $T_{\neg x_1}$. We define a new strategy $\sigma$ as follows: $\sigma(Q_1 \cdot x_1 \cdot \rho) = \sigma^{x_1}(\rho)$ and $\sigma(Q_1 \cdot \neg x_1 \cdot \rho) = \sigma^{\neg x_1}(\rho)$. Consider an outcome of $\sigma$: if it visits $x_1$, then all the objectives in $T_{x_1}$ are visited, and because the path follows $\sigma^{x_1}$, the objectives not in $T_{x_1}$ are also visited. The other case is similar.

We now turn to the converse implication. Assume the formula is not valid. We prove that for any strategy $\sigma$ of player $A$, there is an outcome $\rho$ of this strategy such that some objective $\Omega_i^A$ is not satisfied. We again proceed by induction, beginning with the case where $p = 1$.

• If $Q_1 = \exists$, then both $\phi'[x_1 \leftarrow \top]$ and $\phi'[x_1 \leftarrow \bot]$ are false. This entails that one of the clauses only involves $\bot$ (no other disjunction involving $x_1$ and/or $\neg x_1$ is always false), and the corresponding reachability condition is $\bot$, which is not reachable.

• If $Q_1 = \forall$, then one of $\phi'[x_1 \leftarrow \top]$ and $\phi'[x_1 \leftarrow \bot]$ is false. In the former case, one of the clauses $c_i$ contains $\neg x_1$, or only contains $\bot$. Then along the run $Q_1 \cdot x_1 \cdot \top^\omega$, the target set $T_i^A$ is not visited. The other case is similar.

Now, assuming that the result holds for formulas with $p - 1$ quantifiers, we prove the result for $p$ quantifiers.

• If $Q_1 = \exists$, then both $Q_2 x_2 \ldots Q_p x_p.\ \phi'[x_1 \leftarrow \top]$ and $Q_2 x_2 \ldots Q_p x_p.\ \phi'[x_1 \leftarrow \bot]$ are false. Using the induction hypothesis, any run from $Q_2$ fails to visit some objective not in $T_{x_1} \cup T_{\neg x_1}$. Hence no strategy from $Q_1$ can enforce a visit to all the objectives.

• If $Q_1 = \forall$, then one of $Q_2 x_2 \ldots Q_p x_p.\ \phi'[x_1 \leftarrow \top]$ and $Q_2 x_2 \ldots Q_p x_p.\ \phi'[x_1 \leftarrow \bot]$ is false. We handle the first case, the second one being symmetric. By induction hypothesis, for any strategy $\sigma$ of player $A$ in the game $\mathcal{G}(\phi'[x_1 \leftarrow \top])$, one of the outcomes $\rho'$ fails to visit all the objectives not in $T_{x_1}$. Then along the path $\rho = Q_1 \cdot x_1 \cdot \rho'$, some objectives not in $T_{x_1}$ are not visited. □

We can directly conclude from this lemma that the value of the game for $A$ is 1 (the unique maximal payoff for our preorder) if, and only if, the formula $\phi$ is valid, which proves that the value problem is PSPACE-hard.

Example 7.6. As an example of the construction, let us consider the formula

$$\phi = \forall x_1.\ \exists x_2.\ \forall x_3.\ \exists x_4.\ (x_1 \vee \neg x_2 \vee \neg x_3) \wedge (x_1 \vee x_2 \vee x_4) \wedge (\neg x_4 \vee \bot \vee \bot) \qquad (7.1)$$

The target sets for player $A$ are given by $T_1^A = \{x_1; \neg x_2; \neg x_3\}$, $T_2^A = \{x_1; x_2; x_4\}$, and $T_3^A = \{\neg x_4; \bot\}$. The structure of the game is represented in Figure 25. $B$ has a strategy that falsifies one of the clauses whatever $A$ does, which means that the formula is not valid.


7.1.5. Proof of PSPACE-hardness for the (constrained) NE existence problem. We will now prove PSPACE-hardness for the NE existence problem, under the conditions specified in the statement of Proposition 7.1, using Proposition 3.4. We specify the new preference relation for the construction of Section 3.3. We give $B$ one objective, which is to reach $s_1$ ($s_1$ is the sink state introduced by the construction). In terms of preferences for $A$, going to $s_1$ should be just below visiting all targets. For this we use the statement in Proposition 7.1 that there is $v$ such that for every $v'$, $v' \ne 1 \Leftrightarrow v' \lesssim v$, and add $s_1$ as a target to each $T_i^A$ such that $v_i = 1$. This defines a preference relation equivalent to the one in the game constructed in Section 3.3; we therefore deduce with Proposition 3.4 that the NE existence problem is PSPACE-hard.

Figure 25. Reachability game associated with the formula (7.1) (legend: player A, player B).

7.1.6. Applications. We should now notice that the hardness results above apply to the conjunction, counting and lexicographic preorders (thanks to the fact that 1 is the unique maximal element for these orders, and to Lemma 6.19). As conjunction (for instance) can easily be encoded using a (monotonic) Boolean circuit in polynomial time, the hardness results are also valid if the preorder is given by a (monotonic) Boolean circuit. Finally, the subset preorder can be expressed as a polynomial-size Boolean circuit and has a maximal element. We therefore get the following summary of results:

Corollary 7.7. 

• For finite games with ordered reachability objectives, with either the conjunction, the counting or the lexicographic preorder, the value problem, the NE existence problem and the constrained NE existence problem are PSPACE-complete.

• For finite games with ordered reachability objectives, where the preorders are given by (monotonic) Boolean circuits, the value problem, the NE existence problem and the constrained NE existence problem are PSPACE-complete.

• For finite games with ordered reachability objectives, with the subset preorder, the value problem is PSPACE-complete.

On the other hand, the disjunction and maximise preorders do not have a unique maximal element, so the hardness result does not carry over to these preorders. In the same way, for the subset preorder, there is no $v$ such that $v' \ne 1 \Leftrightarrow v' \lesssim v$, so the hardness result for the NE existence problem does not apply. We prove later (in Section 7.2) that in these special cases, the complexity is actually lower.

7.2. Simple cases. As for ordered Büchi objectives, for some ordered reachability objectives, the preference relation can be (efficiently) (co-)reduced to a single reachability objective. We do not give the formal definitions; they can easily be inferred from those given earlier for Büchi objectives.

Proposition 7.8. 

• For finite games with ordered reachability objectives which are reducible to single reachabil¬ 
ity objectives and in which the preorders are non-trivial, the value problem is P-complete. 

• For finite games with ordered reachability objectives which are co-reducible to single reachability objectives, and in which the preorders are non-trivial, the NE existence problem and the constrained NE existence problem are NP-complete.

Proof. Since P-hardness (resp. NP-hardness) already holds for the value (resp. NE existence) problem with a single reachability objective (see [23, Sect. 2.5.1]), we only focus on the upper bounds.

We begin with the value problem: given a payoff vector $v$ for player $A$, we build the new target set $\widehat{T}$ in polynomial time, and then use a classical algorithm for deciding whether $A$ has a winning strategy (see [23, Sect. 2.5.1]). If she does, then she can secure payoff $v$.

Consider now the constrained NE existence problem, and assume that the preference relation for each player $A$ is given by target sets $(T_i^A)_{1 \le i \le n_A}$. The NP algorithm consists in guessing the payoff vector $(v^A)_{A \in \mathsf{Agt}}$ and an ultimately periodic play $\rho = \pi \cdot \tau^\omega$ with $|\pi|, |\tau| \le |\mathsf{States}|^2$, which, for each $A$, visits $T_i^A$ if, and only if, $v_i^A = 1$. We then co-reduce the payoff to a new target set $\widehat{T}^A$ for each player $A$.

The run $\rho$ is the outcome of a Nash equilibrium with payoff $(v^A)_{A \in \mathsf{Agt}}$ for the original preference relation if, and only if, $\rho$ is the outcome of a Nash equilibrium with payoff 0 with the single reachability objective $\widehat{T}^A$ for each $A \in \mathsf{Agt}$. Indeed, in both cases, this is equivalent to the property that no player $A$ can enforce a payoff greater than $v^A$. Applying the algorithm presented in Section 5.1, this condition can be checked in polynomial time.

□ 

We now see to which ordered objectives this result applies. It is not difficult to realise that the same transformations as those made in the proof of Lemma 6.5 can be made as well for reachability objectives. We therefore get the following lemma, from which we get the remaining results in Table 3.

Lemma 7.9. Ordered reachability objectives with disjunction or maximise preorders are 
reducible to single reachability objectives. Ordered reachability objectives with disjunction, 
maximise or subset preorders are co-reducible to single reachability objectives. 
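To make the reduction concrete in the simplest case, consider the disjunction preorder: a player is satisfied as soon as one of her targets is visited, so her objective amounts to reaching the union of her target sets, and a standard attractor computation decides whether she can enforce it. The sketch below is illustrative only. It assumes a turn-based game (rather than the concurrent games of the paper) in which every state has at least one successor, and the names `attractor` and `secures_disjunction` as well as the small example game are hypothetical.

```python
def attractor(states, owner, succ, target, player="A"):
    """States from which `player` can force a visit to `target`.

    Assumes a turn-based game where every state has a successor.
    """
    win = set(target)
    changed = True
    while changed:
        changed = False
        for s in states:
            if s in win:
                continue
            if owner[s] == player:
                ok = any(t in win for t in succ[s])   # player picks the move
            else:
                ok = all(t in win for t in succ[s])   # opponent picks the move
            if ok:
                win.add(s)
                changed = True
    return win

def secures_disjunction(states, owner, succ, targets, start):
    """Disjunction preorder: reach the union of the target sets."""
    union = set().union(*targets)
    return start in attractor(states, owner, succ, union)

# Hypothetical game: A moves in s0, B moves in a and b, sinks carry self-loops.
states = ["s0", "a", "b", "t1", "t2", "sink"]
owner = {"s0": "A", "a": "B", "b": "B", "t1": "A", "t2": "A", "sink": "A"}
succ = {"s0": ["a", "b"], "a": ["t1", "sink"], "b": ["t1", "t2"],
        "t1": ["t1"], "t2": ["t2"], "sink": ["sink"]}
print(secures_disjunction(states, owner, succ, [{"t1"}, {"t2"}], "s0"))
```

In this example $A$ can enforce the disjunction by moving to `b`, although she can enforce neither target set on its own, which is why the reduction works with the union rather than with the individual targets.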

We conclude with stating the following corollary: 

Corollary 7.10. 

• For finite games with ordered reachability objectives, with either the disjunction or the 
maximise preorder, the value problem is P-complete. 

• For finite games with ordered reachability objectives, with either the disjunction, the maximise or the subset preorder, the NE existence problem and the constrained NE existence problem are NP-complete.


8. Conclusion

Summary and impact of the results. Concurrent games are a natural class of games, extending classical turn-based games with more complex interactions. We have developed a complete methodology, involving new techniques, for computing pure Nash equilibria in this class of games. We were able to characterise the complexity of finding Nash equilibria (possibly with constraints on the payoff) for simple qualitative objectives first (Section 5), and then for semi-quantitative objectives (Sections 6 and 7). We would like to point out that the algorithm for Büchi objectives with maximise preorder (see Section 6.2) has been implemented in the tool Praline⁹ [8].

⁹Available at http://www.lsv.ens-cachan.fr/Software/praline/

We believe the methodology we have developed in this paper can be used in many other contexts, and that the suspect game is a very powerful tool for analysing various properties of multi-agent systems. Indeed, the correspondence between pure Nash equilibria in the original game and winning strategies in the suspect game holds with no assumption on the structure of the game. In particular it can be applied to games given as pushdown systems, counter systems, etc. It also does not assume anything about the preference relations; only the resulting winning condition in the suspect game can become very complex if the preference relations are complex. What remains is purely algorithmic: we have to solve a two-player turn-based game in a potentially complex arena (if the original game structure is complex) with a potentially complex winning condition (if the preference relations are complex).

The suspect game construction can also be adapted to compute many other kinds of 
equilibria; this is for instance applied to robust equilibria in [9]. We believe this can be used 
in many other contexts. 

We have also developed in this paper another tool that might be of independent interest and be useful in other contexts: the game-simulation (see Section 5.7.1). We used this tool several times (for handling objectives given by deterministic Rabin automata, but also for handling ordered reachability objectives). This tool can also be used to handle more complex game structures, as we did in [5] for timed games, when we originally introduced this notion. In particular, the construction done in [5] shows that we can compute Nash equilibria for timed games with all kinds of objectives studied in the current paper.

Our future research will include extending the use of the suspect game abstraction to other families of games, and pushing it further to also handle truly quantitative objectives.

Discussion on the various hypotheses made in this paper. We have assumed strategies are pure, and game structures are deterministic. This is indeed a restriction, and allowing for randomised strategies would be of great interest. Note however that pure Nash equilibria are resistant to malicious randomised players (that is, to deviations by randomised strategies). There is no obvious way to modify the suspect game construction to handle either stochastic game structures or randomised strategies. Indeed, given a history, it is hard to detect strategy deviations if they can be randomised, and therefore the set of suspects is hard to compute (and actually even to define). This difficulty is not surprising, since the existence of a Nash equilibrium in pure or randomised strategies is undecidable for stochastic games with reachability or Büchi objectives [12], and the existence of a Nash equilibrium in randomised strategies is undecidable for deterministic games [JT]. However, we would like to exhibit subclasses of stochastic games for which we can synthesize randomised Nash equilibria; this is part of our research programme.

We have assumed that strategies are based on histories that only record states which 
have been visited, and not actions which have been played. We believe this is more relevant 
in the context of distributed systems, where only the effect of an action might be visible 
to other players. Furthermore, this framework is more general than the one where every 
player could see the actions of the other players, since the latter can easily be encoded in 
the former. In the context of complete information (precise view of the actions), computing 
Nash equilibria is rather easy since, once a player has deviated from the equilibrium, all 
the other players know it and can make a coalition against that player. To illustrate that simplification, we only mention that the constrained NE existence problem falls in NP for finite games with single parity objectives (we can obtain this bound based on the suspect game construction), if we assume that strategies can observe actions, whereas the problem is $\mathsf{P}^{\mathsf{NP}}_{\parallel}$-hard if strategies do not observe the actions.

Finally we have chosen the framework of concurrent games, and not that of turn-based 
games as is often the case in the literature. Concurrent games naturally appear when 
studying timed games [5] (the semantics of a timed game is that of a concurrent game, 
and the abstraction based on regions that is correct for timed games is concurrent), and in 
the context of distributed systems, concurrent moves are also very natural. In fact turn- 
based games are even a simpler case of concurrent games when we assume strategies can 
see the actions. Of course, the suspect game construction applies to turn-based games, 
but becomes quite simple (as is the case if strategies do see actions), since the set of 
suspect players is either the set Agt of all players (this is the case as long as no player 
has deviated from the equilibrium), or reduces to a singleton, as soon as a player has 
deviated. To illustrate this simplification, we notice that in turn-based finite games, the constrained NE existence problem is NP-complete for single parity objectives [10] (it is $\mathsf{P}^{\mathsf{NP}}_{\parallel}$-complete in finite concurrent games).

Appendix: Proof of Proposition 5.2

We show PSPACE-hardness of the constrained existence of a Nash equilibrium for various kinds of qualitative objectives, using an encoding of the satisfiability of a QSAT formula $\psi = \forall x_1.\ \exists x_2. \ldots \forall x_{p-1}.\ \exists x_p.\ \bigwedge_{1 \le i \le n} c_i$, where each $c_i$ is of the form $\ell_{i,1} \vee \ell_{i,2} \vee \ell_{i,3}$ and $\ell_{i,j} \in \{x_k, \neg x_k \mid 1 \le k \le p\}$.

We construct a game $\mathcal{G}_\psi = (\mathsf{States}, \mathsf{Agt}, \mathsf{Act}, \mathsf{Mov}, \mathsf{Tab}, (\lesssim_A)_{A \in \mathsf{Agt}})$ as follows: $\mathsf{States} = \{u, w\} \cup \bigcup_{k \in [1,p]} \{s_k, t_k, f_k, d_k, e_k\} \cup \bigcup_{i \in [1,n]} \{b_i, c_i\}$; $\mathsf{Agt} = \{\mathsf{Eve}\} \cup \bigcup_{k \in [1,p]} \{A_k, B_k\}$; $\mathsf{Act} = \{0, 1, 2\}$. We now define the transition table (the structure of the game is represented in Figure 26).



Figure 26. Encoding of a QSAT formula into a game with succinct representation of the transition formula. Dotted edges correspond to the strategy profile that in each state selects action 0 for every player.

• If $k \le p$ is odd, then in state $s_k$, the transition function is given by¹⁰

$$\Bigl(\bigl(\bigoplus_{1 \le k' \le p} (A_{k'} = 1) \oplus \bigoplus_{1 \le k' \le p,\ k' \ne k} (B_{k'} = 1),\ t_k\bigr),\ \bigl(\bigoplus_{1 \le k' \le p,\ k' \ne k} (A_{k'} = 0) \oplus \bigoplus_{1 \le k' \le p} (B_{k'} = 0),\ f_k\bigr),\ (\top, u)\Bigr)$$

In the first part, the coalition of all the players except Eve and $B_k$ takes the decision to go to $t_k$, and any of those players can switch her action and enforce state $t_k$ (meaning that $x_k$ is set to true); if the move to state $t_k$ is not chosen, then the coalition of all players except Eve and $A_k$ takes the decision to go to $f_k$, and any of those players can switch her action and enforce state $f_k$ (meaning that $x_k$ is set to false); otherwise the game goes to state $u$.

In states $t_k$ and $f_k$, the transition function is $(\top, s_{k+1})$.

• If $k \le p$ is even, then in state $s_k$ the transition function is given by $((\mathsf{Eve} = 1, t_k), (\top, f_k))$. Eve decides the value of variable $x_k$ (state $t_k$ corresponds to setting $x_k$ to true, and state $f_k$ corresponds to setting variable $x_k$ to false).

In state $t_k$, the transition function is given by

$$\Bigl(\bigl(\bigvee_{1 \le k' \le p} (A_{k'} = 1) \vee \bigvee_{1 \le k' \le p,\ k' \ne k} (B_{k'} = 1),\ s_{k+1}\bigr),\ (\top, u)\Bigr)$$

with $s_{p+1} = b_1$: any player except Eve and $B_k$ can decide to go to state $s_{k+1}$ by playing action 1; otherwise the game proceeds to state $u$. Intuitively, any of the above players can "validate" the previous choice of Eve having set $x_k$ to true.

In state $f_k$, the transition function is given by

$$\Bigl(\bigl(\bigvee_{1 \le k' \le p,\ k' \ne k} (A_{k'} = 0) \vee \bigvee_{1 \le k' \le p} (B_{k'} = 0),\ s_{k+1}\bigr),\ (\top, u)\Bigr)$$

with $s_{p+1} = b_1$: any player except Eve and $A_k$ can decide to go to state $s_{k+1}$ by playing action 0; otherwise the game proceeds to state $u$. Intuitively, any of the above players can "validate" the previous choice of Eve having set $x_k$ to false.

¹⁰The operator $\oplus$ evaluates the parity of the number of subformulas that are true: $\bigoplus_h \alpha_h$ is true iff $|\{\alpha_h \mid \alpha_h \text{ evaluates to true}\}|$ is odd.

• If $i \le n$, in $b_i$ the transition function is given by

$$\Bigl(\bigl(\bigoplus_{1 \le k \le p} \bigl((A_k = 1) \oplus (B_k = 1)\bigr),\ c_i\bigr),\ (\top, b_{i+1})\Bigr)$$

with $b_{n+1} = u$. Intuitively, the coalition of all players except Eve can decide to go to state $c_i$, which will mean that they want to check the truth of clause $c_i$. Moreover, any of those players can switch her action and decide on her own to check this clause.

• If $i \le n$, in $c_i$ the transition function is given by

$$\bigl((\mathsf{Eve} = 1,\ d(\ell_{i,1})),\ (\mathsf{Eve} = 2,\ d(\ell_{i,2})),\ (\top,\ d(\ell_{i,3}))\bigr)$$

where for all $1 \le k \le p$, $d(x_k) = d_k$ and $d(\neg x_k) = e_k$. Intuitively, Eve proves that the current clause is satisfied by pointing to the literal which is set to true.

In state $d_k$ ($1 \le k \le p$), the transition function is given by $((B_k = 1, w), (\top, u))$. Player $B_k$ decides to go to $u$ or $w$.

In state $e_k$ ($1 \le k \le p$), the transition function is given by $((A_k = 1, w), (\top, u))$.
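The parity-based transitions are what allow any single member of a coalition to flip the outcome. The following sketch evaluates the transition formula of a state $s_k$ with $k$ odd, as described in the list above. It is only an illustration: the function name `next_state_odd` and the representation of an action profile as a dictionary keyed by `("A", j)` and `("B", j)` are assumptions, and Eve is omitted since her action does not occur in this formula.

```python
from functools import reduce
from operator import xor

def next_state_odd(k, p, act):
    """Successor of s_k for odd k.

    `act` maps ("A", j) and ("B", j), for 1 <= j <= p, to an action in {0, 1, 2}.
    """
    players = [(role, j) for role in "AB" for j in range(1, p + 1)]
    # First guard: parity of the players (all but B_k) who play action 1.
    to_t = reduce(xor, (act[q] == 1 for q in players if q != ("B", k)), False)
    if to_t:
        return ("t", k)
    # Second guard: parity of the players (all but A_k) who play action 0.
    to_f = reduce(xor, (act[q] == 0 for q in players if q != ("A", k)), False)
    return ("f", k) if to_f else "u"

# Everybody plays 2: the game falls through to u.
base = {(role, j): 2 for role in "AB" for j in range(1, 3)}
print(next_state_odd(1, 2, base))
# A single coalition member switching to 1 changes the parity and enforces t_1.
print(next_state_odd(1, 2, {**base, ("A", 2): 1}))
```

Player $B_k$ is excluded from the first guard, so she cannot on her own enforce $t_k$; symmetrically, $A_k$ cannot on her own enforce $f_k$.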

Intuitively, in the game we have just defined, Eve will be in charge of properly choosing the value of the existentially quantified variables in $\psi$. The value of the variables will be given by the history (visiting $t_k$ means variable $x_k$ is set to true, whereas visiting $f_k$ means variable $x_k$ is set to false). Then, player $A_k$ will be in charge of witnessing that variable $x_k$ is set to true, whereas player $B_k$ will be in charge of witnessing that variable $x_k$ is set to false. Their role will be clearer in the proof.

The objective for each player $A_k$, $B_k$ is to reach state $w$, and for Eve to reach state $u$. This is naturally a reachability objective, but it can also be encoded as a Büchi objective or a safety objective where the goal is to avoid state $u$ for $A_k$ and $B_k$, and to avoid $w$ for Eve.

We will show that there is a Nash equilibrium in $\mathcal{G}_\psi$ where Eve wins if, and only if, $\psi$ is valid.

Before switching to the proof of this equivalence, we define a correspondence between (partial) valuations and histories in the game: with a partial valuation $v \colon \{x_1, \ldots, x_k\} \to \{\mathsf{true}, \mathsf{false}\}$, we associate the history $\mathsf{h}(v) = s_1 w_1 s_2 w_2 \cdots w_k s_{k+1}$ where for all $1 \le k' \le k$, $w_{k'} = t_{k'}$ (resp. $w_{k'} = f_{k'}$) if $v(x_{k'}) = \mathsf{true}$ (resp. $v(x_{k'}) = \mathsf{false}$). Conversely, with every history $h$ in $\mathcal{G}_\psi$, we associate the partial valuation $v_h \colon \{x_1, \ldots, x_k\} \to \{\mathsf{true}, \mathsf{false}\}$ such that state $s_{k+1}$ (with $s_{p+1} = b_1$) is the latest such state appearing along $h$, and $v_h(x_{k'}) = \mathsf{true}$ (resp. $\mathsf{false}$) if $h$ visits $t_{k'}$ (resp. $f_{k'}$), for all $1 \le k' \le k$.
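This correspondence is easy to make explicit. The sketch below is a hypothetical encoding, not taken from the construction itself: states are represented as pairs such as `("s", 1)`, `("t", 1)`, `("f", 1)`, the state $s_{p+1} = b_1$ is written `("s", p + 1)`, and only histories made of such states (that is, prefixes ending in some $s_{k+1}$) are handled.

```python
def history_of(valuation):
    """History associated with a partial valuation, given as a list of
    booleans for x_1, ..., x_k."""
    hist = []
    for i, value in enumerate(valuation, start=1):
        hist.append(("s", i))
        hist.append(("t", i) if value else ("f", i))
    hist.append(("s", len(valuation) + 1))
    return hist

def valuation_of(history):
    """Partial valuation read off a history: x_i is true iff t_i was visited,
    for every i below the index of the latest s-state."""
    last_s = max(i for kind, i in history if kind == "s")
    val = {i: kind == "t" for kind, i in history if kind in ("t", "f")}
    return [val[i] for i in range(1, last_s)]

h = history_of([True, False])
print(h)
print(valuation_of(h))
```

The two maps are inverse to each other on such histories, which is what lets the proof move freely between valuations of the quantified variables and plays of the game.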

Assume formula $\psi$ is valid. For all players $A_k$ and $B_k$ we set strategies $\sigma_{A_k}$ and $\sigma_{B_k}$ to always play action 2. We now turn to the strategy for Eve. Consider a history $h = s_1 \cdot w_1 \cdots w_{k-1} \cdot s_k$ where $k \le p$ is even (so that Eve controls $s_k$). Let $v'$ be the valuation where $v'(x_{k'}) = v_h(x_{k'})$ for all $k' < k$, and $v'(x_k) = \mathsf{true}$. We set $\sigma_{\mathsf{Eve}}(h)$ to be 1 if $v'$ makes the formula $\forall x_{k+1}. \ldots \exists x_p.\ \bigwedge_{1 \le i \le n} c_i$ valid, and 0 otherwise. Since $\psi$ is valid, one of the two choices makes the rest of the formula true. This ensures that a history that reaches $b_1$ and that is compatible with $\sigma_{\mathsf{Eve}}$ will define a valuation that makes $\bigwedge_{1 \le i \le n} c_i$ true. Fix a history $h$ that is compatible with $\sigma_{\mathsf{Eve}}$ and ends up in some state $c_i$: the strategy of Eve is to go to $d(\ell_{i,j})$ where the literal $\ell_{i,j}$ makes the clause $c_i$ true under valuation $v_h$. For all other histories, we set the strategy of Eve to be 2.

We show that the strategy profile $\sigma_{\mathsf{Agt}} = (\sigma_{\mathsf{Eve}}, (\sigma_{A_k}, \sigma_{B_k})_{1 \le k \le p})$ is a Nash equilibrium. First notice that the outcome of $\sigma_{\mathsf{Agt}}$ is $s_1 \cdot u^\omega$ (since all players $A_k$ and $B_k$ play action 2); Eve wins, and all the other players lose. We now describe interesting deviating strategies for the players $A_k$ or $B_k$.

• Consider a deviating strategy $\sigma'_{A_k}$ for player $A_k$: let $h \in \mathrm{Out}^f(\sigma_{\mathsf{Agt}}[A_k \mapsto \sigma'_{A_k}])$; if $\sigma'_{A_k}(h) = 2$, then $\mathrm{Out}(\sigma_{\mathsf{Agt}}[A_k \mapsto \sigma'_{A_k}])$ ends up in state $u$; therefore an interesting deviating strategy should choose some value 0 or 1 after any history. Now if $k'$ is odd with $k' \ne k$, then from $s_{k'}$, player $A_k$ can choose to go to $t_{k'}$ (action 1) or $f_{k'}$ (action 0). If $k$ is odd, then the only way not to end up in $u$ from $s_k$ is to choose action 1, which leads to state $t_k$. Now at state $t_{k'}$ with $k'$ even, $A_k$ should validate the choice of Eve (that is, play action 1 in $t_{k'}$, meaning that variable $x_{k'}$ has value true). At state $f_{k'}$ with $k'$ even and $k' \ne k$, $A_k$ should validate the choice of Eve (that is, play action 0 in $f_{k'}$, meaning that variable $x_{k'}$ has value false). At state $f_k$ if $k$ is even, nothing can be done which could be profitable to player $A_k$: state $u$ will be reached.

• A similar reasoning can be done for player $B_k$: the only difference is at state $s_k$ when $k$ is odd, where player $B_k$ can only choose action 0 and go through $f_k$.

• In the part of the game after $b_1$, each player can deviate and choose to go to some state $c_i$: this choice will be made for checking the truth of clause $c_i$ under the valuation that has been fixed by the history so far.

Consider a deviation of some player that moves to $c_i$, and write $h$ for the corresponding history up to state $c_i$. The strategy of Eve after $h$ is to go to $d(\ell_{i,j})$ where $\ell_{i,j}$ sets $c_i$ to true under valuation $v_h$. If $\ell_{i,j} = x_k$ (so that $d(\ell_{i,j}) = d_k$), then (a) this means that $v_h(x_k) = \mathsf{true}$, and (b) the next state is controlled by player $B_k$. Using the characterization of interesting deviating strategies above, it cannot be the case that player $B_k$ is the deviating player since from $t_k$ (which is visited by $h$), if only $B_k$ deviates, the game unavoidably goes to state $u$. Hence, for every strategy $\sigma'_{B_k}$ for player $B_k$, history $h$ cannot be an outcome of $\sigma_{\mathsf{Agt}}[B_k \mapsto \sigma'_{B_k}]$. In particular, no deviation of player $B_k$ can lead to state $w$. Similarly, if $\ell_{i,j} = \neg x_m$, the outcome ends up in $u$. In other words, each time a player other than Eve changes her strategy, the outcome ends up in $u$, yielding no improvement for the player.

Hence no player can improve her outcome by unilaterally changing her strategy, which shows that the strategy profile $\sigma_{\mathsf{Agt}}$ is a Nash equilibrium where Eve wins.

Now assume there is a Nash equilibrium $\sigma_{\mathsf{Agt}}$ in which Eve wins. Let $v$ be a valuation such that for every even $2 \le k' \le p$,

$$\bigl(\sigma_{\mathsf{Eve}}(\mathsf{h}(v_{<k'})) = 1\bigr) \Leftrightarrow \bigl(v(x_{k'}) = \mathsf{true}\bigr) \qquad (\#)$$

where $v_{<k'}$ is the valuation $v$ restricted to $\{x_1, \ldots, x_{k'-1}\}$. We show the following two properties:

• if $v(x_k) = \mathsf{true}$ then there is a strategy $\sigma'_{A_k}$ for $A_k$ s.t. $\mathsf{h}(v) \in \mathrm{Out}^f(\sigma_{\mathsf{Agt}}[A_k \mapsto \sigma'_{A_k}])$;

• if $v(x_k) = \mathsf{false}$ then there is a strategy $\sigma'_{B_k}$ for $B_k$ s.t. $\mathsf{h}(v) \in \mathrm{Out}^f(\sigma_{\mathsf{Agt}}[B_k \mapsto \sigma'_{B_k}])$.

We show the result by induction on the number of atomic propositions. For zero atomic propositions, the result obviously holds. Assume the result holds for atomic propositions $\{x_1, \ldots, x_{h-1}\}$ ($h \le p$). Let $v$ be a valuation over $\{x_1, \ldots, x_h\}$, and $k$ such that $v(x_k)$ is true. Define $v'$ as the restriction of $v$ to atomic propositions $\{x_1, \ldots, x_{h-1}\}$. By induction hypothesis, $\mathsf{h}(v') = s_1 \cdot w_1 \cdots w_{h-1} \cdot s_h$ is an outcome of some strategy $\sigma'_{A_k}$ for player $A_k$.

• If $h$ is odd. Let $m_{\mathsf{Agt}} = \sigma_{\mathsf{Agt}}(\mathsf{h}(v'))$. We set $\sigma'_{A_k}(\mathsf{h}(v'))$ to be 1 if $\bigoplus_{1 \le k' \le p,\ k' \ne k} (m_{A_{k'}} = 1) \oplus \bigoplus_{1 \le k' \le p,\ k' \ne h} (m_{B_{k'}} = 1)$ is different from $v(x_h)$, and to be 0 otherwise. Then we have that the next state is $t_h$ if, and only if, $v(x_h)$ is true.

• If $h$ is even, then the state after $s_h$ (actually after $\mathsf{h}(v')$) is $t_h$ if $v(x_h)$ is true, and $f_h$ otherwise, and this cannot be changed by player $A_k$. Then in $t_h$ and $f_h$ we set $\sigma'_{A_k}(\mathsf{h}(v') \cdot t_h)$ (resp. $\sigma'_{A_k}(\mathsf{h}(v') \cdot f_h)$) to be 1. Note that since $v(x_k)$ is true we cannot reach $f_k$, hence setting the action of $A_k$ in those states to 1 always ensures that the next state is $s_{h+1}$.

This shows that $\mathsf{h}(v) \in \mathrm{Out}^f(\sigma_{\mathsf{Agt}}[A_k \mapsto \sigma'_{A_k}])$ for some strategy $\sigma'_{A_k}$. The second property can be proven similarly for player $B_k$.


Let $v$ be a valuation satisfying condition (#). We show that $\bigwedge_{1 \le i \le n} c_i$ evaluates to true under that valuation. Let $c_i$ be a clause of $\psi$, and let $\ell_{i,j}$ be the literal selected by $\sigma_{\mathsf{Eve}}(\mathsf{h}(v) \cdot b_1 \cdots b_i \cdot c_i)$. We show that $v(\ell_{i,j}) = \mathsf{true}$, which means that $c_i$ evaluates to true under $v$. This will show that formula $\psi$ is valid since condition (#) defines sufficiently many witness valuations. Assume w.l.o.g. that $\ell_{i,j} = x_k$. Assume towards a contradiction that $v(x_k) = \mathsf{false}$. We have proven that there is a strategy $\sigma'_{B_k}$ for player $B_k$ such that $\mathsf{h}(v) \cdot b_1 \cdots b_i \cdot c_i \in \mathrm{Out}^f(\sigma_{\mathsf{Agt}}[B_k \mapsto \sigma'_{B_k}])$. Now, the state $d_k = d(x_k)$ is controlled by player $B_k$, so $B_k$ can enforce a visit to $w$ from $d_k$; hence there is a strategy $\sigma''_{B_k}$ for player $B_k$ such that $\mathsf{h}(v) \cdot b_1 \cdots b_i \cdot c_i \cdot d_k \cdot w \in \mathrm{Out}^f(\sigma_{\mathsf{Agt}}[B_k \mapsto \sigma''_{B_k}])$. This contradicts the fact that $\sigma_{\mathsf{Agt}}$ is a Nash equilibrium. We conclude that $v(x_k) = \mathsf{true}$, and therefore that $\psi$ is valid (as explained above). □